
Digital Twins from Conjoint Data: How Individual-Level Utilities Become Persistent Respondents

Run one conjoint study. Turn every respondent into a persistent digital twin. Then test unlimited concepts, messages, and follow-up questions against the same panel, forever.

Product · April 16, 2026 · Myles Friedman · 10 min read

Conjoint analysis is the gold standard for measuring how people make tradeoffs. It tells you exactly how much a respondent values price versus efficacy versus brand versus convenience, at the individual level. No other methodology gives you that kind of precision on preference structure.

But conjoint studies have a structural limitation. They are one-shot. You design the study, field it, estimate utilities, run your simulations, and then the panel is gone. If you want to ask a follow-up question, test a new message, or evaluate a product configuration you did not include in the original design, you have to recruit a new panel and start over.

That is a waste. Those individual-level utility vectors contain an extraordinary amount of information about each respondent's preferences. They should not be sitting in a spreadsheet collecting dust after the final report ships. They should be working for you on every research question that follows.

This is what conjoint-seeded digital twins do. Sometimes called synthetic personas, these twins carry individual-level preference data from the original study and turn every respondent into a persistent, queryable entity that preserves that person's exact preference structure. You run the conjoint once. Then you re-query the same panel as many times as you want, on any topic, at zero incremental fielding cost.

The Problem: Conjoint Data Dies After the Simulator

A typical conjoint study follows a predictable lifecycle. You spend weeks on study design, choosing attributes and levels. You spend more time and money recruiting respondents and fielding the choice tasks. Then you run HB-MNL estimation to extract individual-level part-worth utilities for every respondent. You build a market simulator. You deliver the report.

And then the data goes cold.

The simulator answers the specific questions it was designed to answer: which product configuration wins the most share, how price changes affect preference, what happens if a competitor enters with a specific profile. But it cannot answer questions outside the original attribute space. It cannot tell you which marketing message would resonate with respondent #247. It cannot evaluate a product concept that uses different language than the conjoint levels. It cannot run a follow-up survey on treatment satisfaction or brand perception.

The individual-level utilities are rich, precise, and expensive to collect. But their useful life ends when the simulator is built. Everything you learn about each respondent's preference structure is locked inside a narrow set of predefined attributes and levels.

The Solution: Turn Conjoint Respondents into Persistent Digital Twins

A digital twin is a persistent AI model of a specific person. It carries that person's demographics, attitudes, and preference structure, and it can answer new questions as if it were that individual. The twin is not a static utility vector. It is a living respondent that can engage with any research question you throw at it.

When you seed a digital twin from conjoint data, the twin inherits the respondent's individual-level utility vector as its preference backbone. But it goes beyond the conjoint. The twin can reason about topics outside the original attribute space because it understands the person's underlying preference structure, not just the specific levels they evaluated.

A price-sensitive respondent stays price-sensitive when you ask about messaging. A brand-loyal respondent evaluates new concepts through a brand-first lens. The preference structure transfers because the twin carries a semantic understanding of who this person is and how they make decisions.

How It Works: From Choice Tasks to Persistent Twins

The pipeline from conjoint data to digital twins has four distinct steps. Each one builds on the last.

Step 1: Run the Conjoint Study

This is a standard choice-based conjoint (CBC). Respondents see a series of choice tasks, each presenting two or more product profiles defined by a set of attributes (price, brand, efficacy, dosing frequency, whatever is relevant to your category). They choose their preferred option in each task. The design is typically D-optimal or balanced overlap to maximize the statistical information extracted from each respondent.

Nothing here is different from what you would do in any conjoint study. The key is that you need enough choice tasks per respondent (typically 12-20) to support individual-level estimation.
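To make the design concrete, here is a minimal sketch of how CBC choice tasks can be laid out. The attribute names and levels are illustrative, and a real study would prune the candidate profiles with a D-optimal or balanced-overlap design rather than sampling at random:

```python
import itertools
import random

# Hypothetical attribute levels (illustrative; use your category's attributes).
ATTRIBUTES = {
    "price":  ["low", "mid", "premium"],
    "brand":  ["branded", "generic"],
    "dosing": ["once daily", "twice daily"],
}

# Full factorial of candidate profiles. A real design would select a
# statistically efficient subset instead of drawing tasks at random.
PROFILES = [dict(zip(ATTRIBUTES, combo))
            for combo in itertools.product(*ATTRIBUTES.values())]

def make_tasks(n_tasks=12, alts_per_task=3, seed=0):
    """Draw choice tasks: each task shows `alts_per_task` distinct profiles."""
    rng = random.Random(seed)
    return [rng.sample(PROFILES, alts_per_task) for _ in range(n_tasks)]

tasks = make_tasks()
```

Each respondent works through the `tasks` list and picks one profile per task; those picks are the raw input to estimation.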

Step 2: Estimate Individual-Level Utilities via HB-MNL

This is where the magic happens. Hierarchical Bayesian Multinomial Logit (HB-MNL) estimation uses Markov chain Monte Carlo (MCMC) sampling, typically via a Gibbs sampler, to estimate a unique utility vector for every individual respondent. Unlike aggregate MNL, which gives you a single set of population-level utilities, HB-MNL produces respondent-specific part-worths by borrowing strength across the sample while preserving individual variation.

The output is a matrix: one row per respondent, one column per attribute level. Respondent #247 might have a utility of 2.1 for the lowest price tier, 0.3 for the mid-tier, and -2.4 for the premium tier, meaning they are strongly price-sensitive. Respondent #391 might show the opposite pattern, with strong positive utilities for the premium brand and near-zero sensitivity to price. These are not segment averages. They are individual preference structures estimated from observed choice behavior.

Step 3: Translate Utilities into Structured Preference Profiles

This step is what separates a useful digital twin from a raw data export. LLMs reason well over natural language descriptions of preferences. They reason poorly over raw numeric utility matrices. If you hand a language model a vector like [2.1, 0.3, -2.4, 1.8, -0.6], it has no reliable way to use those numbers to generate consistent, preference-aligned responses.

The translation step converts each respondent's utility vector into a structured semantic profile. Instead of raw numbers, the twin gets a natural language description of who this person is and what they value:

  • Price sensitivity: "Strongly prefers low-cost options. Willing to sacrifice brand prestige and some efficacy for a lower price point. Price is the dominant decision driver."
  • Brand orientation: "Low brand sensitivity. Does not differentiate meaningfully between branded and generic options when price and efficacy are comparable."
  • Efficacy weighting: "Moderate efficacy sensitivity. Values efficacy but will trade some clinical benefit for meaningful cost savings."
  • Convenience factors: "Slight preference for less frequent dosing, but convenience is a secondary consideration behind price and efficacy."

This translation preserves the rank ordering and relative magnitude of the original utilities while giving the AI model a representation it can actually use. The profile becomes a persona document that conditions every response the twin generates.
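A minimal sketch of that translation step might map utility spreads onto natural-language strength labels. The thresholds here are invented for illustration; a production pipeline would calibrate them against the sample's utility distribution:

```python
def describe(attribute, spread):
    """Translate a utility spread into a natural-language label the
    LLM can reason over. Thresholds are illustrative, not calibrated."""
    if spread > 2.0:
        strength = "dominant decision driver"
    elif spread > 0.8:
        strength = "meaningful but secondary consideration"
    else:
        strength = "minor factor"
    return f"{attribute}: {strength} (utility spread {spread:+.1f})"

# Respondent #247's spreads, derived from their part-worths.
profile_lines = [
    describe("Price sensitivity", 4.5),
    describe("Efficacy weighting", 1.2),
    describe("Brand orientation", 0.4),
]
```

The resulting lines read like the bullets above ("Price sensitivity: dominant decision driver ...") while preserving the rank ordering of the underlying numbers.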

Step 4: The Profile Becomes a Persistent Digital Twin

Once the semantic profile exists, the twin is live. It carries the respondent's demographics (captured during the conjoint screener), their preference profile (derived from the utility translation), and any additional context from the original study (open-ended responses, screening questions, attitudinal items).

The twin is persistent. It exists as a queryable entity that can be re-surveyed at any time. Ask it a new question next week or next quarter, and it will answer from the same preference foundation. The responses are consistent because the underlying profile does not change between queries.

What You Can Do with Conjoint-Seeded Twins

Once you have converted your conjoint panel into digital twins, the research possibilities expand well beyond the original study scope. Here are the highest-value applications.

Message Testing

Test which messages resonate with different preference segments by querying the twins directly. Show the price-sensitive cluster a value-focused message and a premium message. Show the brand-loyal cluster the same messages. Measure which framing each group responds to, and why.

This is powerful because the segments are defined by actual choice behavior, not self-reported attitudes. You know respondent #247 is price-sensitive because their revealed preferences from the conjoint show it, not because they checked a box saying "price is important to me."
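The behavioral segmentation described above can be sketched as a simple rule over utility spreads: assign each respondent to the attribute with the largest spread in their part-worths. (Real segmentation would likely cluster the full utility vectors; the numbers here are illustrative.)

```python
def dominant_driver(spreads):
    """Assign a respondent to a segment by their largest utility spread,
    i.e., by revealed choice behavior rather than self-report."""
    return max(spreads, key=spreads.get)

# Illustrative per-respondent utility spreads by attribute.
spreads = {
    247: {"price": 4.5, "brand": 0.4, "efficacy": 1.2},
    391: {"price": 0.2, "brand": 3.8, "efficacy": 1.0},
}
segments = {rid: dominant_driver(s) for rid, s in spreads.items()}
```

Here respondent #247 lands in the price-driven segment and #391 in the brand-driven one, purely from what their choices revealed.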

Concept Testing

Evaluate new product configurations against the same panel that did the original conjoint. Want to test a product concept that combines attributes in a way the conjoint did not include? The twins can evaluate it. Want to test 20 concepts instead of the 5 you simulated? Go ahead. There is no incremental fielding cost.

The twins evaluate each concept through the lens of their individual preference structure, giving you individual-level concept scores rather than just segment averages.

Follow-Up Surveys

Ask entirely new questions without re-fielding. The original conjoint might have focused on product attributes, but now you need to understand treatment satisfaction, information sources, or competitive perceptions. The twins can answer these questions because they carry a complete preference profile, not just the conjoint attributes.

This is where the economics become compelling. A single conjoint study that cost $150K to field can generate an unlimited number of follow-up studies at a fraction of the cost. The panel never expires, never has attrition, and never needs re-recruitment.

Segment Deep-Dives

Query specific attitudinal or behavioral clusters in depth. Identify the 15% of respondents who are most price-sensitive and run a dedicated study just on that segment. Or find the respondents whose utility profiles suggest they are on the fence between two product configurations and explore what would tip them one way or the other.

With live panels, this kind of targeted follow-up is prohibitively expensive. You would need to re-contact specific respondents, hope they are still available, and pay premium incentives for the callback. With digital twins, you just filter and query.

Why Individual-Level Matters

The entire value of this approach depends on individual-level estimation. If you only had aggregate utilities (population-level averages from a standard MNL model), the digital twins would all be the same. Every twin would carry the average preference structure, and there would be no differentiation between respondents.

HB-MNL solves this by producing a unique utility vector for each respondent. The Bayesian framework borrows strength from the population distribution to stabilize individual estimates, which means you get reliable individual-level data even with a moderate number of choice tasks per person.

This individual-level precision is what makes the twins useful. When you run a message test, you do not get a single answer. You get 500 different answers from 500 different preference profiles. You can see exactly which messages work for which types of respondents, and you can segment the results by any dimension of the underlying preference structure.

Group averages hide the variation that matters most. A message that scores 7.2 on average might score 9.1 among price-sensitive respondents and 4.8 among brand-loyal ones. Without individual-level twins, you would never see that split. You would just see the average and make a suboptimal decision.
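The averaging trap is easy to demonstrate with toy numbers. The scores below are invented, but the mechanic is the same as the 7.2-average example above: the overall mean sits between two segment means that call for opposite decisions.

```python
# Toy message-test scores (0-10), one score per twin, grouped by segment.
scores = {
    "price_sensitive": [9, 9, 10, 9, 8],
    "brand_loyal":     [5, 4, 5, 5, 4],
}

def mean(xs):
    return sum(xs) / len(xs)

overall = mean([s for segment in scores.values() for s in segment])
by_segment = {segment: mean(xs) for segment, xs in scores.items()}
```

The overall mean lands at 6.8, which looks mediocre, while the segment means (9.0 vs 4.6) show a message that is a clear winner for one group and a clear loser for the other.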

Example: 500 Physicians, One Conjoint, Unlimited Studies

Consider a pharmaceutical company that runs a conjoint study with 500 physician respondents across cardiology, endocrinology, and primary care. The study evaluates treatment attributes for a cardiovascular drug: efficacy (LDL reduction), safety profile, dosing frequency, price/copay tier, and brand.

After HB-MNL estimation, the company has 500 individual utility vectors. The standard deliverable is a market simulator that answers questions about share of preference across predefined product configurations.

With conjoint-seeded digital twins, those 500 physicians become a persistent panel. Here is what the team does with them over the following six months:

  • Month 1: The brand team tests 8 different messaging concepts against the full panel. They discover that efficacy-first messaging works best with cardiologists, while PCPs respond better to messages that lead with simplicity and dosing convenience. This insight comes directly from the conjoint-derived preference profiles.
  • Month 2: A competitor announces a new formulation. The team queries the twins to simulate how the competitor's entry would shift prescribing intent across specialties. They can do this in hours instead of weeks because the panel is already built.
  • Month 3: The medical affairs team wants to understand barriers to guideline adherence among the same physician population. They run a follow-up survey on the twins, asking questions that were never part of the original conjoint.
  • Month 5: The pricing team needs to evaluate a new copay assistance program. They query the price-sensitive segment specifically, testing different assistance thresholds against physicians whose conjoint data shows high price sensitivity.
  • Month 6: The team is preparing for an advisory board and wants to pre-test discussion topics. They query a subset of KOL-profile twins to identify which topics generate the most varied perspectives.

All six studies use the same 500 physician twins, seeded from a single conjoint. The original study cost six figures to field. The follow-on studies cost a fraction of that, with no recruitment, no scheduling, and no attrition.

The Translation Step Is the Key Innovation

If there is one takeaway from this piece, it is that the translation from numeric utilities to semantic preference profiles is what makes the whole system work. Without it, you just have a spreadsheet of numbers. With it, you have persistent AI respondents that reason about new questions using the same preference structure that was revealed in the conjoint.

This is a non-obvious step. Most people who think about using AI with conjoint data imagine feeding the raw utility matrix to a model and asking it to extrapolate. That does not work. Language models are not designed to reason over numeric parameter vectors. They hallucinate patterns, ignore magnitudes, and produce inconsistent results.

The structured semantic profile solves this by meeting the model where it is strong. LLMs are excellent at maintaining persona consistency when given a well-defined character description. They are excellent at reasoning about tradeoffs when the tradeoff structure is described in natural language. The translation step converts conjoint output into exactly the format that AI models handle best.

This is also why the digital twin approach works better than simply prompting a model to "act like a price-sensitive cardiologist." The twin does not rely on the model's generic training data about cardiologists. It carries a specific, quantitatively derived preference profile that was estimated from real choice data. The twin is grounded in observed behavior, not stereotypes.

Getting Started

If you have conjoint data sitting in a drawer, whether from a study you completed last month or last year, it can be converted into a persistent digital twin panel. The Simsurveys platform handles the full pipeline: utility ingestion, semantic translation, twin creation, and ongoing querying.

If you are planning a new conjoint study, consider designing it with twin creation in mind from the start. Include a few additional screener items and attitudinal questions to enrich the preference profiles beyond the core conjoint attributes.

For a deeper look at how synthetic conjoint analysis works as a standalone methodology, see our post on synthetic conjoint analysis. For a broader overview of how digital twins fit into market research, see our guide on digital twins for market research.

The core insight is straightforward. Conjoint data is the best source of individual-level preference data in market research. That data should not be used once and discarded. It should become the foundation for a persistent research asset that compounds in value with every question you ask.

Reach out to discuss how your conjoint data can become a persistent digital twin panel.

Turn your conjoint data into digital twins.

One conjoint study becomes a persistent panel you can re-query for messaging, concepts, and follow-up studies, without going back to field.