Frequently Asked Questions

Answers to common questions about synthetic survey data, validation methodology, the Simsurveys platform, Oracle API, and pricing.

About Synthetic Survey Data

What is synthetic survey data?

Synthetic survey data is research data generated by AI models that simulate how real human respondents would answer survey questions. Simsurveys uses domain-specific machine learning models trained on millions of validated survey responses to produce statistically representative results, typically reaching 80-90% alignment with live panel benchmarks across key metrics.

How accurate is synthetic survey data compared to traditional panels?

Simsurveys synthetic data consistently achieves 80-90% alignment with live panel benchmarks across nine completed validation studies. Accuracy is measured using KL divergence and other statistical tests that compare response distributions. While not identical to live panels, synthetic data delivers research-grade insights at a fraction of the cost and turnaround time.

How does Simsurveys validate its synthetic data?

Simsurveys runs the same survey instrument through both its AI models and live human panels, then compares the response distributions using statistical measures including KL divergence. Each domain model undergoes rigorous validation before release. All validation reports are published openly so researchers can evaluate the methodology and results for themselves.

What is KL divergence and why does it matter?

KL divergence (Kullback-Leibler divergence) is a statistical measure that quantifies how one probability distribution differs from another. Simsurveys uses it to compare synthetic response distributions against live panel distributions. Lower KL divergence values indicate closer alignment, and a value of zero means the two distributions are identical. It is the primary metric used in our validation studies to ensure synthetic outputs meet research-grade standards.
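
To make the metric concrete, here is a minimal Python sketch of KL divergence for a single survey question, using the standard definition D_KL(P || Q) = sum over answer options i of P(i) * log(P(i) / Q(i)). Several choices below are assumptions for illustration and are not taken from the Simsurveys validation reports: treating the live panel as the reference distribution P, using the natural logarithm, adding a small smoothing constant so that an option with a zero share does not cause a division by zero, and the example shares themselves.

```python
import math


def kl_divergence(p, q, eps=1e-9):
    """Return D_KL(P || Q) in nats for two discrete distributions.

    p and q are lists of shares (or raw counts) over the same answer
    options, in the same order. eps is an assumed smoothing constant.
    """
    if len(p) != len(q):
        raise ValueError("both distributions must cover the same answer options")
    if any(x < 0 for x in p) or any(x < 0 for x in q):
        raise ValueError("shares and counts must be non-negative")

    # Smooth, then renormalise so each distribution sums to 1.
    p_smoothed = [x + eps for x in p]
    q_smoothed = [x + eps for x in q]
    p_total = sum(p_smoothed)
    q_total = sum(q_smoothed)

    divergence = 0.0
    for p_raw, q_raw in zip(p_smoothed, q_smoothed):
        p_i = p_raw / p_total
        q_i = q_raw / q_total
        divergence += p_i * math.log(p_i / q_i)
    return divergence


# Made-up shares for one four-option question (illustration only).
live_panel = [0.10, 0.20, 0.40, 0.30]
synthetic = [0.12, 0.18, 0.38, 0.32]

divergence = kl_divergence(live_panel, synthetic)
print(f"KL divergence (live panel vs. synthetic): {divergence:.5f}")
```

Note that KL divergence is not symmetric: swapping the two arguments generally gives a different value, so a comparison should state which distribution is the reference.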

Can synthetic data replace live panels entirely?

Synthetic data is best used as a complement to live panels, not a full replacement. It excels at rapid concept testing, early-stage research, and budget-constrained projects where traditional panels are impractical. For high-stakes regulatory research or studies requiring verbatim human responses, live panels remain the gold standard. Simsurveys is transparent about these limitations.

About the Platform

How does the Simsurveys platform work?

You create a survey using our web-based platform, select a target demographic and domain model, and launch your study. Our AI models generate synthetic respondents who complete the survey based on trained behavioral patterns. Results are typically available within minutes. You can then analyze data in the platform dashboard or export it for use in your preferred analytics tools.

What question types does Simsurveys support?

Simsurveys supports multiple-choice single select, multiple-choice multi-select, Likert scales, ranking questions, and matrix/grid formats. These cover the vast majority of quantitative survey research needs. Open-ended text responses are not currently supported because synthetic verbatim text carries different accuracy characteristics than structured response data.

How long does it take to get results?

Most studies return complete results within minutes of launching. This is one of the primary advantages over traditional panels, which typically require days or weeks for fieldwork. The exact time depends on study complexity and sample size, but even large studies with thousands of synthetic respondents are generally completed in under an hour.

What export formats are available?

Simsurveys supports CSV and Excel export formats, which are compatible with all major analytics tools including SPSS, R, Python, Tableau, and Excel. Data exports include response-level detail with demographic attributes and weighting variables. You can also access results programmatically through the Oracle API for automated workflows and integrations.
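
As an illustration of working with a response-level export, the Python sketch below computes weighted answer shares for one question from CSV data. The column names (respondent_id, age_group, q1, weight) and the sample rows are hypothetical, since the actual export layout is not described here; adjust them to match the headers in your own file.

```python
import csv
import io
from collections import defaultdict


def weighted_distribution(rows, question_col, weight_col):
    """Return each answer's weighted share of all responses to one question."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[question_col]] += float(row[weight_col])

    grand_total = sum(totals.values())
    if grand_total <= 0:
        raise ValueError("total weight must be positive")
    return {answer: weight / grand_total for answer, weight in totals.items()}


# Hypothetical export contents; a real file would be opened with open(path, newline="").
SAMPLE_EXPORT = """respondent_id,age_group,q1,weight
1,18-34,Yes,1.5
2,35-54,No,1.0
3,55+,Yes,0.5
4,18-34,No,1.0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_EXPORT)))
shares = weighted_distribution(rows, question_col="q1", weight_col="weight")

for answer, share in sorted(shares.items()):
    print(f"{answer}: {share:.1%}")
```

The same function works on any column pair, so it can be reused across questions by changing the question_col argument.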

About Oracle API

What is the Oracle API?

The Oracle API is Simsurveys' programmatic interface that allows developers and AI agents to query consumer preferences and opinions on demand. It serves as a real-time preference layer for applications that need to understand what people think, want, or would choose. The API returns structured survey-style data without requiring a manual survey setup process.

What is the "preference layer" for agentic commerce?

The preference layer is the idea that AI agents can access real-time consumer opinion data to make better purchasing and recommendation decisions on behalf of users. Instead of relying on historical data or reviews alone, agents query the Oracle API to understand current preferences, attitudes, and decision factors for any product category or consumer question.

How do AI agents use the Oracle API?

AI agents send natural-language queries to the Oracle API and receive structured preference data in return. For example, a shopping agent could ask what features consumers prioritize when choosing a laptop and receive statistically grounded response distributions. This enables agents to make recommendations that reflect actual consumer preferences rather than relying solely on product specifications or reviews.
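
The Oracle API's request and response formats are not specified in this FAQ, so the Python sketch below makes no API call. It only shows how an agent might rank options once it holds a response distribution. The payload shape (a "question" string plus "options" entries with "label" and "share" fields), the function name, and the numbers are all hypothetical and chosen purely for illustration.

```python
def rank_options(payload, top_n=3):
    """Return the top_n (label, normalised share) pairs, highest share first."""
    options = payload["options"]
    total = sum(option["share"] for option in options)
    if total <= 0:
        raise ValueError("shares must sum to a positive value")

    ranked = sorted(options, key=lambda option: option["share"], reverse=True)
    return [(option["label"], option["share"] / total) for option in ranked[:top_n]]


# Hypothetical payload; not the documented Oracle API response format.
hypothetical_response = {
    "question": "What features do consumers prioritize when choosing a laptop?",
    "options": [
        {"label": "Battery life", "share": 0.34},
        {"label": "Price", "share": 0.29},
        {"label": "Performance", "share": 0.22},
        {"label": "Weight", "share": 0.15},
    ],
}

top_two = rank_options(hypothetical_response, top_n=2)
for label, share in top_two:
    print(f"{label}: {share:.0%}")
```

Consult the Oracle API documentation for the actual field names and query format before adapting this.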

About Domain Models

What domain models does Simsurveys offer?

Simsurveys currently offers domain-specific AI models for healthcare, consumer research, patient experience, and social research. Each model is independently trained and validated against live panel data in its respective domain. Domain models capture the unique response patterns, terminology awareness, and behavioral characteristics specific to each research area.

Can I create a custom AI model for my industry?

Yes. Simsurveys offers custom domain model development for enterprise clients with specialized research needs. Custom models are trained on industry-specific data and validated against your existing panel benchmarks. The process typically involves a discovery phase, model training, and a rigorous validation cycle. Contact our sales team to discuss your requirements and timeline.

Pricing & Getting Started

How much does Simsurveys cost?

Simsurveys delivers research-grade survey data at approximately one-tenth the cost of traditional panel research. Pricing varies by study complexity, sample size, and domain model. We offer both per-study pricing and subscription plans for organizations with ongoing research needs. Contact sales for a detailed quote based on your specific research requirements.

Is there a free trial?

Yes. You can create a free account and run your first study without a credit card. The free tier allows you to experience the full platform workflow, including survey creation, synthetic data generation, and results analysis. This lets you evaluate data quality and platform capabilities before committing to a paid plan for larger or ongoing research projects.

What's included in each study?

Each Simsurveys study includes survey creation tools, AI-generated synthetic respondents from your chosen domain model, a real-time results dashboard with cross-tabulation capabilities, and data export in CSV and Excel formats. Demographic targeting, response weighting, and statistical summaries are built into every study at no additional cost.

Methodology & Trust

How many validation studies has Simsurveys completed?

Simsurveys has completed nine validation studies across its domain models, comparing synthetic output against live panel benchmarks. Each study runs identical survey instruments through both AI models and human panels, then measures alignment using KL divergence and other statistical tests. All validation reports are published on our website for full transparency.

Is synthetic survey data ethical?

Simsurveys takes the ethics of synthetic data seriously. Our models do not replicate or expose individual responses. All synthetic outputs are clearly labeled as AI-generated and never misrepresented as human data. We publish responsible use guidelines and are transparent about capabilities and limitations. No personal data is collected or stored from real respondents during the generation process.

What are the limitations of synthetic survey data?

Synthetic survey data has known limitations. It is less suited for open-ended qualitative research, highly niche populations with limited training data, or regulatory submissions requiring human respondent data. Model accuracy depends on training data quality and domain coverage. Simsurveys publishes these limitations openly and recommends live panels for use cases where synthetic data is not yet validated.

Still Have Questions?

Create your free account and generate your first research study in minutes, or reach out to our team for a personalized walkthrough.