Your Reviews Replicate You: LLM-Based Agents as Customer Digital Twins for Conjoint Analysis
Conjoint analysis is a cornerstone of market research for estimating consumer preferences; however, traditional methods face persistent challenges regarding time, cost, and respondent fatigue. To address these limitations, this study proposes a framework that uses large language model (LLM)-based "customer digital twins" (CDTs) as virtual respondents. We identified active users within the Reddit community and aggregated their review histories to construct individualized vector databases. By integrating retrieval-augmented generation (RAG) with prompt engineering, we developed customer agents capable of dynamically retrieving and reasoning over their own past preferences and constraints. These customer agents, the CDTs, performed pairwise comparison tasks on product profiles generated via a fractional factorial design, and the resulting choice data were analyzed with logistic regression to estimate part-worth utilities. Empirical validation shows that the CDTs predict the preferences of actual users with 87.73% accuracy. Furthermore, a case study on the computer monitor category quantified trade-offs between attributes such as panel type and resolution, yielding preference structures consistent with market realities. Ultimately, this study contributes to marketing research by presenting a scalable alternative to traditional methods that significantly improves both agility and cost-efficiency.
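To make the estimation step concrete, the following is a minimal sketch of how part-worth utilities can be recovered from pairwise choices with logistic regression. It is not the authors' implementation: the attribute levels, the `ASSUMED_UTILITY` values, and the simulated choices are hypothetical stand-ins for CDT responses, and the profile pairs are drawn at random rather than from a fractional factorial design. The model is a no-intercept logistic regression on the difference between the dummy-coded profiles of the two alternatives, fitted with plain gradient descent so that the example needs only the standard library.

```python
import math
import random

# Hypothetical attribute levels, for illustration only (not the paper's design).
ATTRIBUTES = {
    "panel": ["IPS", "VA", "OLED"],
    "resolution": ["1080p", "1440p", "4K"],
}
# One coefficient per non-baseline level; the first level of each attribute is the baseline.
FEATURES = [f"{a}={lv}" for a, lvs in ATTRIBUTES.items() for lv in lvs[1:]]


def encode(profile):
    """Dummy-code a profile dict against the baseline levels."""
    return [1.0 if profile[a] == lv else 0.0
            for a, lvs in ATTRIBUTES.items() for lv in lvs[1:]]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_part_worths(pairs, choices, lr=1.0, epochs=200):
    """No-intercept logistic regression on x(A) - x(B); choices[i] is 1 if A was chosen."""
    diffs = [[a - b for a, b in zip(encode(pa), encode(pb))] for pa, pb in pairs]
    w = [0.0] * len(FEATURES)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for d, y in zip(diffs, choices):
            p = sigmoid(sum(wi * di for wi, di in zip(w, d)))
            for j, dj in enumerate(d):
                grad[j] += (p - y) * dj
        # Batch gradient descent step on the mean negative log-likelihood.
        w = [wi - lr * g / len(diffs) for wi, g in zip(w, grad)]
    return dict(zip(FEATURES, w))


# Synthetic stand-in for agent responses: an assumed utility drives simulated choices.
rng = random.Random(0)
ASSUMED_UTILITY = {"panel=VA": 0.5, "panel=OLED": 1.5,
                   "resolution=1440p": 0.8, "resolution=4K": 1.6}
true_w = [ASSUMED_UTILITY[f] for f in FEATURES]


def random_profile():
    return {a: rng.choice(lvs) for a, lvs in ATTRIBUTES.items()}


pairs = [(random_profile(), random_profile()) for _ in range(1500)]
choices = []
for pa, pb in pairs:
    gap = sum(t * (a - b) for t, a, b in zip(true_w, encode(pa), encode(pb)))
    choices.append(1 if rng.random() < sigmoid(gap) else 0)

estimated = fit_part_worths(pairs, choices)
for name, value in estimated.items():
    print(f"{name}: {value:+.2f}")
```

Each estimated coefficient is the utility of a level relative to its attribute's baseline (IPS and 1080p in this sketch), so comparing coefficients across attributes gives the kind of trade-off the abstract describes. In practice a library routine with standard errors would likely replace the hand-written optimizer.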
Computer Science > Information Retrieval. arXiv:2604.22756. Submitted on 6 Mar 2026. Authors: Bin Xuan, Jungmin Hwang, Hakyeon Lee.