A synthetic assembly of minds

Rehearse
reality.

GroupThinQ summons living populations of AI minds — each with a demography, a psychology, and an honest opinion — so you can hear how the world will answer before you ask it.

Run the Demo

Watch a society think

~90% of human panel reliability · validated on 9,300 real responses · verdicts in minutes, not weeks

“I’d buy it twice: once to keep.” Mind №214 · 26 · Pune
“My grandmother made this better, for free.” Mind №078 · 61 · Jaipur
“The font alone loses me.” Mind №142 · 33 · Berlin
“Finally, someone priced it like they mean it.” Mind №391 · 41 · Austin
“This is a solution looking for a problem.” Mind №057 · 58 · Coimbatore
“Send it to my sister. Don’t tell her I said that.” Mind №233 · 29 · Mumbai
“I read ‘artisanal’ and stopped reading.” Mind №310 · 47 · Leeds
“Okay. The packaging would end up on my stories.” Mind №112 · 24 · Delhi

Standing on published science

Stanford University · Google DeepMind · PyMC Labs × Colgate-Palmolive · 90+ peer-reviewed sources · AAPOR-aligned methodology

The problem

Every launch is a wager
placed in the dark.

You can build the product, write the copy, set the price. But the only way to learn what an audience actually thinks has been to wait, pay, and hope — so most teams simply don’t ask.

Six weeks of silence

A traditional concept test takes four to eight weeks to field, code, and report. Your market does not pause while you wait for permission to act.

4–8 weeks to field one study

A toll only giants pay

At $30,000–$80,000 per study, research is rationed. The decisions that most needed testing are the ones that never got it — instinct fills the gap.

$30k–$80k per concept test

The room lies

Focus groups answer to the loudest voice in them. People perform agreement in public and change their minds in private. You measured the theater, not the preference.

1 voice often decides for 8

0% of new products fail. Most of those verdicts were available in advance — nobody could afford to hear them.

The simulation — live on this page

A society in miniature,
summoned on demand.

Pick a stimulus. The same three minds — drawn from real census and psychometric distributions — will read it and answer honestly. They are free to be bored. They are free to say no.

Stimulus № 047 · Concept test

“Saffron Press — a cold-brew chai subscription. ₹299 a month, delivered every Sunday morning.”

Target audience · Urban India · 22–60 · mixed income

Meera Krishnan

31 · UX designer · Bengaluru

High Openness · Budget-aware

“The Sunday-ritual angle is genuinely lovely. But ₹299 is two café visits I already enjoy. I’d trial a month — keeping me past the novelty is the hard part.”

Curious, not committed · 3.4 / 5

Rajan Iyer

58 · Retired banker · Coimbatore

High Conscientiousness · Habit-led

“I have made my own filter coffee every morning for thirty years. Why would I pay a stranger to post me chai? A solution looking for a problem I don’t have.”

Firm no · 1.6 / 5

Ananya Bose

24 · Content strategist · Mumbai

High Extraversion · Trend-forward

“Okay — the packaging alone would end up on my stories. If the first box tastes as good as it sounds, I’m subscribed, and three of my friends are too.”

Enthusiastic yes · 4.3 / 5

Panel verdict · 300 minds

[Chart: distribution of purchase intent, from “Definitely not” to “Definitely yes”. Bar labels shown: 17%, 29%, 27%, 18%]

38%

Top-2-box intent

Strong with metro 22–34. Collapses past 45. Price is the pivot.

Tier B · rank-order reliable

Illustrative output — in the demo, every panel is generated fresh against your demographic

The method

From census to agora
in five movements.

Summon the population

Census margins · correlated traits

Describe an audience - "urban Indian women, 25-40, mid-income" - and GroupThinQ assembles statistically faithful minds. Age, education, and income follow national census margins; the Big Five are drawn together as a correlated multivariate distribution, so the synthetic crowd keeps the joint structure real people carry.
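The idea of drawing the Big Five together, rather than one trait at a time, can be sketched as follows. This is a minimal illustration, assuming a multivariate normal and a correlation matrix invented for the example; it does not reproduce the meta-analytic values the page cites, and every name in it (`CORR`, `draw_trait_profiles`, `sample_correlation`) is hypothetical.

```python
import math
import random

TRAITS = ["O", "C", "E", "A", "N"]

# Illustrative correlations only (assumption): NOT the published meta-analytic values.
CORR = [
    [1.0, 0.2, 0.4, 0.2, -0.2],
    [0.2, 1.0, 0.3, 0.4, -0.4],
    [0.4, 0.3, 1.0, 0.3, -0.4],
    [0.2, 0.4, 0.3, 1.0, -0.4],
    [-0.2, -0.4, -0.4, -0.4, 1.0],
]


def cholesky(matrix):
    """Return lower-triangular L with L * L^T == matrix (matrix must be positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def draw_trait_profiles(n_minds, corr=CORR, seed=0):
    """Draw standardized trait scores whose correlations follow `corr`."""
    rng = random.Random(seed)
    lower = cholesky(corr)
    size = len(corr)
    profiles = []
    for _ in range(n_minds):
        z = [rng.gauss(0.0, 1.0) for _ in range(size)]  # independent draws
        x = [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(size)]
        profiles.append(dict(zip(TRAITS, x)))
    return profiles


def sample_correlation(profiles, a, b):
    """Pearson correlation between two traits across the drawn profiles."""
    xs = [p[a] for p in profiles]
    ys = [p[b] for p in profiles]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)
```

Five independent draws would give every pair of traits a correlation near zero; multiplying by the Cholesky factor is what carries the joint structure into the crowd.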

Stage the stimulus

Text · image · price · A/B/n

A product concept, an ad, a price point, a tweet, an image - the panel sees exactly what your audience would see. No more, no less. Test one idea, or pit five against each other.

Let every mind speak

Independent · sampled · plural

Each agent reacts privately, in its own voice - no moderator, no pressure to agree. When a category needs more breadth, Verbalized Sampling asks for probability-weighted sets of possible reactions, recovering diversity that ordinary prompting tends to flatten.
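A rough sketch of the Verbalized Sampling step, under stated assumptions: the prompt wording, the JSON reply format, and the names `build_vs_prompt`, `parse_weighted_reactions`, and `sample_reaction` are all invented for illustration, and no model is actually called. The example reply is made up.

```python
import json
import random


def build_vs_prompt(persona, stimulus, k=5):
    """Ask for k candidate reactions with probabilities instead of a single answer."""
    return (
        f"You are {persona}. Read the stimulus below and give {k} distinct reactions "
        "you might plausibly have, each with the probability that you would react that way. "
        'Reply as a JSON list of objects with keys "text" and "probability".\n\n'
        f"Stimulus: {stimulus}"
    )


def parse_weighted_reactions(reply_json):
    """Parse the reply and renormalize the stated probabilities so they sum to 1."""
    items = json.loads(reply_json)
    total = sum(float(item["probability"]) for item in items)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive number")
    return [(item["text"], float(item["probability"]) / total) for item in items]


def sample_reaction(weighted, rng):
    """Pick one reaction in proportion to its probability."""
    texts = [text for text, _ in weighted]
    weights = [prob for _, prob in weighted]
    return rng.choices(texts, weights=weights, k=1)[0]


# Made-up reply, standing in for what a model might return.
EXAMPLE_REPLY = json.dumps([
    {"text": "I would try it for a month.", "probability": 0.5},
    {"text": "Too expensive for chai.", "probability": 0.3},
    {"text": "I already make my own.", "probability": 0.1},
])
```

Sampling from the weighted set, rather than taking only the most likely reaction, is what keeps minority responses in the panel.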

Weigh the verdict

SSR · raking · intervals · Kish n

Reactions are scored with Semantic Similarity Rating, then post-stratified by IPF raking against the target population. Every verdict ships with a bootstrap 95% confidence interval, Kish effective sample size, and a computed B/C validation tier with variance-collapse diagnostics.
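The raking and effective-sample-size steps can be sketched in a few lines. This is a minimal illustration, assuming categorical variables, target shares that sum to 1 per variable, and at least one respondent in every target level; the function names and the data layout are hypothetical.

```python
def rake(rows, targets, n_iter=200, tol=1e-10):
    """Iterative proportional fitting.

    rows: list of dicts, e.g. {"age": "young", "city": "metro"}
    targets: {variable: {level: target share}}
    Returns one weight per row so weighted margins match the targets.
    """
    weights = [1.0] * len(rows)
    for _ in range(n_iter):
        max_shift = 0.0
        for var, shares in targets.items():
            total = sum(weights)
            current = {level: 0.0 for level in shares}
            for row, w in zip(rows, weights):
                current[row[var]] += w
            for i, row in enumerate(rows):
                # Scale each row so its level hits the target share on this margin.
                factor = shares[row[var]] * total / current[row[var]]
                max_shift = max(max_shift, abs(factor - 1.0))
                weights[i] *= factor
        if max_shift < tol:
            break
    return weights


def weighted_share(rows, weights, var, level):
    """Weighted share of rows where rows[var] == level."""
    hit = sum(w for row, w in zip(rows, weights) if row[var] == level)
    return hit / sum(weights)


def kish_effective_n(weights):
    """Kish effective sample size: (sum of weights)^2 / sum of squared weights."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)
```

Equal weights give an effective sample size equal to the panel size; the more raking has to stretch the weights, the further the effective size falls below it.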

Convene the agora

Networked opinion · cascade risk

A final social-dynamics pass lets opinions meet. Friedkin-Johnsen dynamics run over a small-world network with trait-derived conformity, then confidence-weighted pooling shows how word of mouth moves the market - and where a cascade may begin.
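A minimal sketch of the social-dynamics pass, assuming a Watts-Strogatz-style ring graph, equal influence from every neighbour, and a per-agent conformity value in [0, 1] supplied by the caller. How conformity is derived from traits is not shown, and all names are hypothetical.

```python
import random


def small_world(n, k, p, rng):
    """Ring lattice where each node links to its k nearest neighbours,
    with each edge rewired to a random node with probability p."""
    neighbours = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            if rng.random() < p:
                choices = [c for c in range(n) if c != i and c not in neighbours[i]]
                if choices:
                    j = rng.choice(choices)
            neighbours[i].add(j)
            neighbours[j].add(i)
    return neighbours


def friedkin_johnsen(initial, neighbours, conformity, n_steps=100):
    """Friedkin-Johnsen update:
    x_i(t+1) = c_i * mean(neighbour opinions at t) + (1 - c_i) * x_i(0)
    where c_i is agent i's conformity and x_i(0) its private opinion."""
    opinions = list(initial)
    for _ in range(n_steps):
        updated = []
        for i, private in enumerate(initial):
            peers = neighbours[i]
            peer_mean = sum(opinions[j] for j in peers) / len(peers)
            updated.append(conformity[i] * peer_mean + (1 - conformity[i]) * private)
        opinions = updated
    return opinions
```

An agent with conformity 0 never moves from its private opinion, while an agent near 1 drifts toward its neighbours. Every agent stays anchored to where it started, so opinions shift without collapsing into full consensus.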

The evidence

Accuracy is the product.

Synthetic research lives or dies on one question: does the simulation predict reality? We hold ourselves to the published record, preserve disagreement where markets are made, and report every number against the ceiling of human self-agreement.

The headline result

“Semantic Similarity Rating reaches roughly 90% of human test-retest reliability when synthetic purchase intent is judged against real survey response.”

Maier et al. 2025 · arXiv:2510.08338 · human purchase-intent dispersion as the audit

SSR verdicts vs human reliability: 90%

Naive AI prompting preserving dispersion: 42%

Two human panels agreeing (the ceiling): 100%
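For readers who want intuition for Semantic Similarity Rating, here is a toy sketch of the general idea: compare a free-text reaction against one anchor statement per scale point and turn the similarities into a distribution over ratings. The anchor wording, the bag-of-words stand-in for a real embedding model, and all function names are assumptions made for illustration; this is not the published implementation.

```python
import math
from collections import Counter

# Hypothetical anchor statements, one per point on a 1-5 purchase-intent scale.
ANCHORS = {
    1: "I would definitely not buy this",
    2: "I would probably not buy this",
    3: "I am not sure whether I would buy this",
    4: "I would probably buy this",
    5: "I would definitely buy this",
}


def embed(text):
    """Toy embedding: word counts. A real system would use a sentence-embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def ssr_distribution(response, anchors=ANCHORS):
    """Map a free-text response to a probability distribution over ratings."""
    vector = embed(response)
    sims = {rating: cosine(vector, embed(text)) for rating, text in anchors.items()}
    floor = min(sims.values())
    shifted = {rating: s - floor for rating, s in sims.items()}  # least similar anchor gets 0
    total = sum(shifted.values())
    if total == 0:
        return {rating: 1 / len(anchors) for rating in anchors}  # no signal: uniform
    return {rating: s / total for rating, s in shifted.items()}


def expected_rating(pmf):
    """Mean rating implied by the distribution."""
    return sum(rating * prob for rating, prob in pmf.items())
```

Because each response yields a distribution rather than a single forced number, the spread of opinion across a panel has a chance to survive scoring.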

Why most AI panels fail

Ask an AI to “rate this 1-5” and opinion collapses to the middle: everyone mildly likes everything, while the spread that real markets are made of disappears. Matching trait marginals alone is no cure; synthetic panels can still break the joint structure of real populations.

GroupThinQ now draws the Big Five as a correlated multivariate distribution, using meta-analytic inter-trait correlations rather than five independent dice rolls. Verbalized Sampling can then elicit probability-weighted sets of reactions, recovering diversity that ordinary prompting loses to mode collapse.

van der Linden et al. 2010 · Williams et al. 2026 · Zhang et al. 2025

Marble portrait bust of the emperor Gaius — The Met, public domain

Every verdict is auditable: ask any mind why it voted, then watch the agora move it.

Every number ships with its uncertainty

95%

Bootstrap interval

Every verdict carries a bootstrap confidence interval, so a winning concept must clear uncertainty rather than merely decorate it.
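One common way to attach an interval like this is a percentile bootstrap. The sketch below assumes integer ratings on a 1-5 scale and unweighted respondents; the scores, the function names, and the choice of the percentile method are illustrative assumptions rather than a description of the engine.

```python
import random


def top2_share(scores):
    """Share of respondents rating 4 or 5 on a 1-5 scale (top-2-box)."""
    return sum(1 for s in scores if s >= 4) / len(scores)


def bootstrap_ci(scores, stat, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap: resample respondents with replacement,
    recompute the statistic, and read off the central `level` band."""
    rng = random.Random(seed)
    n = len(scores)
    estimates = sorted(
        stat([rng.choice(scores) for _ in range(n)]) for _ in range(n_boot)
    )
    alpha = (1 - level) / 2
    lower = estimates[int(alpha * n_boot)]
    upper = estimates[min(n_boot - 1, int((1 - alpha) * n_boot))]
    return lower, upper


# Made-up panel of 300 ratings, for illustration only.
EXAMPLE_SCORES = [1] * 51 + [2] * 87 + [3] * 48 + [4] * 81 + [5] * 33
```

Two concepts whose intervals overlap heavily should not be declared winner and loser on the strength of their point estimates alone.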

B/C

Computed tier

The engine assigns the validation tier from evidence on the category: B for rank-order confidence, C for directional screening.

Variance watch

Variance-collapse diagnostics compare synthetic purchase-intent dispersion against human dispersion before the result is allowed to speak loudly.
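A variance-collapse check can be as simple as a ratio of spreads. The sketch below assumes ratings on a shared scale and a human benchmark with non-zero spread; the 0.6 floor is an arbitrary placeholder, and both function names are hypothetical.

```python
import statistics


def dispersion_ratio(synthetic, human):
    """Synthetic standard deviation as a fraction of the human one.
    Values well below 1 suggest the panel has collapsed toward the middle."""
    return statistics.pstdev(synthetic) / statistics.pstdev(human)


def variance_collapsed(synthetic, human, floor=0.6):
    """Flag a panel whose spread falls below `floor` times the human spread.
    The default floor is an illustrative assumption, not a calibrated threshold."""
    return dispersion_ratio(synthetic, human) < floor
```

A panel where everyone answers 3 would score a ratio of 0 and be flagged, however plausible its average looks.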

Use cases

If it will ever meet an audience,
rehearse it here.

Product

Concept testing

Rank five product directions before a rupee of engineering is spent. Find the winner — and the segment that crowns it.

“Which of these three flavors wins with health-conscious millennials in Tier-1 cities?”

Marketing

Copy & creative

A/B/n test headlines, taglines, and campaign concepts against the exact demographic the media buy will reach.

“Test five headlines for the relaunch against urban professionals, 25–34.”

Pricing

Price sensitivity

Watch intent bend as the price moves. Find the point where enthusiasm breaks — segment by segment.

“Where does top-2-box intent fall below 30% for SMB buyers?”

Content

Posts before posting

Hear how a tweet, a reel, or an announcement lands across audiences — including the ones it will annoy — before it ships.

“How will founder-stage CTOs receive this pricing-change post?”

Research

Survey rehearsal

Pilot your questionnaire on synthetic respondents first. Catch confusing wording and broken scales before the field bill arrives.

“Run the CSAT redesign past 200 respondents matching our base.”

Strategy

Segment cartography

Map where your audience fractures — which psychological and demographic fault lines split the verdict on the same idea.

“Show me who loves and who hates the rebrand, and why.”

Our covenant

We would rather be trusted
than impressive.

Synthetic research earns its seat by knowing its own limits. The industry’s overclaims are our moat: we publish what works, disclose what doesn’t, and stamp every number with the confidence it deserves.

Veristic portrait, 1st c. B.C. — the Roman fashion for being shown exactly as you are.

The Met · public domain

What we stand behind

Rank-ordering of ideas within validated categories — the decision that matters most, made reliable
Directional effects and segment differences, anchored to human data wherever it exists
A rehearsal layer that makes your human research cheaper, sharper, and better targeted
Every output traceable: ask any agent to explain its verdict

What we refuse to claim

We never report fake margins of error — synthetic panels are not probability samples
We don’t predict virality; no method on earth reliably can, and we won’t pretend otherwise
We don’t sell uncalibrated point estimates for narrow subgroups as facts
We are not a replacement for human research on high-stakes, irreversible decisions

Methodology designed for AAPOR-era disclosure standards · radical transparency is the credibility wedge

The assembly
is waiting.

Summon a population. Stage your idea.
Hear three hundred minds answer.

Enter the Demo

Runs locally · no account · full demo mode without API keys
