The Problem with Patient Research
Patient research has always been harder than consumer research. Recruitment is slow and expensive. IRB approvals add weeks or months. Response rates are declining. And the patients who do participate tend to skew toward the highly engaged — not the average person managing a chronic condition.
The result is a persistent gap: the organizations that need patient insights most — pharma companies, hospital systems, medical device manufacturers, health equity researchers — face the highest barriers to getting them.
Today we are launching the Simsurveys Patient model, our fourth domain-specific AI model, purpose-built for patient research. It joins our existing Healthcare (HCP), Consumer, and Social Research models.
Built on Federal Health Data
The Patient model is trained on more than 500,000 de-identified patient records drawn from six publicly available federal health datasets. Every record is non-PHI, openly published by federal agencies, and available for research use.
NHIS
National Health Interview Survey — ~30,000 respondents per year covering health status, access, insurance, and chronic conditions. Includes adult and child components.
BRFSS
Behavioral Risk Factor Surveillance System — 450,000+ annual responses on health behaviors, preventive practices, and chronic disease prevalence.
MEPS
Medical Expenditure Panel Survey — 18,000+ respondents on healthcare costs, utilization, insurance, and patient experiences.
NHANES
National Health and Nutrition Examination Survey — ~15,000 respondents combining interviews with physical examinations and lab data.
These four surveys are supplemented by CAHPS (Consumer Assessment of Healthcare Providers and Systems) hospital experience data and the PROMIS measurement system, which contributes 2,078 calibrated items covering physical, mental, and social health outcomes. Because NHIS includes a child component, the model also covers pediatric populations, one of the hardest segments to recruit in traditional patient research.
No panel partner required. Because the Patient model is grounded in publicly available federal datasets, researchers can generate patient survey data without engaging a panel company, building a patient recruitment pipeline, or waiting on IRB review. The data is already de-identified and cleared for research use at the source.
What It Does
The Patient model generates synthetic patient responses across the full range of patient research topics. It works with all three Simsurveys generation modes — synthetic, augmented, and expanded data — and supports the same question types as our other domain models.
Patient Experience
Hospital experience, provider communication, care coordination, and satisfaction metrics aligned with CAHPS frameworks.
Treatment & Medication
Drug attitudes, treatment satisfaction, medication adherence, side effect tolerance, and switching behavior.
Chronic Disease
Condition-specific research for diabetes, cardiovascular disease, respiratory conditions, chronic pain, and mental health.
Health Equity
Access disparities, social determinants of health, insurance coverage gaps, and underserved population research.
Condition Targeting
The Patient model supports condition-based targeting. Researchers can specify conditions such as Type 2 diabetes, COPD, chronic pain, or depression, and the model generates respondents whose health profiles, comorbidities, treatment patterns, and care experiences are statistically consistent with real patients managing those conditions.
This means you can run a study among chronic pain patients, or patients with multiple comorbidities, without the recruitment challenges that typically make these populations hard to reach at scale.
Validation Results
We validated the Patient model against three published benchmark surveys, following the same rigorous validation framework we use across all Simsurveys models.
- KFF GLP-1 Weight Loss Drug Survey: Validated against the Kaiser Family Foundation's 2023 survey on GLP-1 drug awareness, usage, and attitudes across demographic segments.
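- HCAHPS Hospital Patient Experience: Validated against CMS hospital experience data. All 19 tested questions achieved a KL divergence below 0.15, and most fell below 0.08. KL divergence measures how far one answer distribution sits from another, so lower values indicate a closer match to the benchmark.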
- US Pain Foundation 2022 Survey: Validated against the national chronic pain survey on pain management, treatment satisfaction, and quality-of-life impact.
All three validation reports are available for download on our publications page, with full distribution tables, metric summaries, and subgroup analyses.
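For readers unfamiliar with the KL divergence figures quoted above, here is a minimal Python sketch of how such a score can be computed for a single survey question. It is an illustration under stated assumptions, not the Simsurveys validation code: the direction of the divergence (benchmark relative to synthetic), the natural-log base, the smoothing constant, and the example answer shares are all assumptions made for this sketch. Only the 0.15 threshold comes from the results reported here.

```python
import math


def kl_divergence(benchmark, synthetic, eps=1e-9):
    """Return KL(benchmark || synthetic) in nats.

    Both arguments are counts or shares over the same ordered answer options.
    The direction, log base, and eps smoothing are assumptions of this sketch.
    """
    if len(benchmark) != len(synthetic):
        raise ValueError("distributions must cover the same answer options")

    # Normalize the benchmark so its shares sum to 1.
    b_total = sum(benchmark)
    p = [b / b_total for b in benchmark]

    # Smooth the synthetic shares so an empty answer option never causes log(0).
    smoothed = [s + eps for s in synthetic]
    s_total = sum(smoothed)
    q = [s / s_total for s in smoothed]

    # Options with zero benchmark share contribute nothing to the sum.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical shares for one four-option question (illustrative values only).
benchmark = [0.05, 0.10, 0.25, 0.60]
synthetic = [0.07, 0.12, 0.26, 0.55]

score = kl_divergence(benchmark, synthetic)
print(f"KL divergence: {score:.4f}  (below 0.15: {score < 0.15})")
```

A score of zero means the two distributions are identical, and the score grows as the synthetic answers drift away from the benchmark. Because KL divergence is not symmetric, swapping the two arguments gives a different number, so any comparison across questions should keep the direction fixed.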
Independent validation: Our approach is also supported by peer-reviewed research. Toubia et al. (2025), published in Marketing Science, demonstrated that LLM-based respondent models can reproduce survey responses with high fidelity across 2,000 respondents and 500 questions.
Applications
The Patient model is designed for research teams across pharma, health systems, payers, and medical devices. Common applications include:
- Pharma patient experience studies — drug attitudes, adherence patterns, treatment satisfaction, and switching behavior
- Hospital quality measurement — CAHPS-aligned patient experience metrics for benchmarking and improvement
- Condition-specific research — targeted studies for specific disease areas without rare-population recruitment challenges
- Health equity research — access disparities, social determinants, and underserved population insights
- Medical device research — patient preferences, usability feedback, and adoption drivers
Getting Started
The Patient model is available now to all Simsurveys users. Visit the Patient model page to learn more, download our technical white paper, or create your free account to run your first patient study.