Are AI Twins the Future of Research or a Very Sophisticated Shortcut?

The Experience Strategy Podcast | Dave Norton, Joe Pine, and Aransas Savas

Two recent Wall Street Journal articles dropped into the market research world like stones in still water. The first profiles Simile, a Palo Alto startup spun out of Stanford in 2024, which recently raised $100 million in Series A funding to build AI "twins" modeled on real people for polling and market research. The second profiles Aura, a company founded by teenagers (the technical co-founder was 15 at the time) and now valued at over $1 billion, already attracting clients like McDonald's, EY, and A24 with the bold claim that AI bots can predict human behavior better than humans themselves.

On the latest episode of The Experience Strategy Podcast, Dave Norton, Joe Pine, and Aransas Savas bring their combined decades of consumer research experience to the question everyone in insights is quietly asking: Is this the end of primary research, or the beginning of something more powerful?

→ Listen to the episode here

The Flight Simulator Problem

Dave's first instinct was worry, maybe panic 😊. Stone Mantel has built its practice on deep, situational consumer research, the kind that reveals not just what people prefer, but how they behave in the specific moments that matter most. The idea of AI twins predicting real-world results to within 0.5% felt, at first, like a threat to everything that work stands for.

But then he found a useful analogy: flight simulators.

Simulators serve a real and important purpose. They're used to train pilots, test scenarios, and stress-test systems in ways that would be dangerous or impossible with a real aircraft. But no one conflates the simulator with the plane. The moment that distinction gets blurry, you have a problem.

That's the frame for AI twins in market research. They can serve a purpose, as long as everyone is clear about what they are and what they aren't.

The Critical Flaw: Behavior Isn't Static

Both Dave and Joe independently land on the same structural problem with current AI twin models: they're built on fixed preferences and demographic profiles. They assume behavior is stable ("this is how soccer moms respond") when the entire premise of situational research is that behavior shifts with context.

What mode is the person in? What situation are they navigating? What are they trying to accomplish right now?

Those questions aren't being asked.

Joe puts it plainly: they didn't ask anything about modes. And if you don't understand the mode, you don't understand the person. You have a data point dressed up as a human. (Learn more about modes in our previous blog posts: What is A Buying Mode & How Do You Support it? and Creating Business Value Through Modes.)

Where AI Twins Actually Work

Trend prediction and aggregate market analysis are reasonable use cases. Aura's own origin story illustrates this well: the founders first validated their model by forecasting the 2020 elections and landing within 0.5% of the actual results. For Spindrift Beverage, Aura's bots selected a new line of iced teas in a single week, replicating the results of a traditional 500-person survey that would have taken two months.

At a macro level, pattern recognition is exactly what AI does well.

The harder and more valuable problem is understanding what a specific person cares about in specific situations and modes. That requires something current AI twins simply are not equipped to provide.

What They Could Become

Dave raises an intriguing possibility: what if AI twins weren't a replacement for primary research, but an extension of it? Once deep qualitative work with a real consumer is complete, that data becomes the seed for ongoing simulation and modeling. Not instead of the research, but as a way to extend its value across time and decisions.

It's a compelling idea, and a genuinely useful one. But Dave also flags the risk: every feedback loop that improves AI accuracy may also drift the model further from the original human signal. The twin becomes less like the person it was built from, and no one notices.

At Stone Mantel, we're actively exploring exactly this direction, but with a critical design difference. The AI digital twins we're developing for clients are built with contextual, situational, and mode filters baked in from the start. Rather than modeling a person as a fixed set of preferences, the goal is to model how people behave differently depending on what they're trying to accomplish, what situations they're in, and what modes they're operating from. That's the missing layer in current AI twin models, and it's the layer that makes the difference between a demographic profile and a genuinely useful simulation of human behavior.
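For readers who want that design difference in concrete terms, here is a minimal sketch in Python. Everything in it is illustrative: the class names, modes, and numbers are hypothetical stand-ins, not Stone Mantel's actual implementation. The point it demonstrates is structural: a preference lookup keyed by mode can answer questions a static profile cannot.

```python
from dataclasses import dataclass

@dataclass
class StaticTwin:
    """One fixed set of preferences, regardless of context:
    the structure Dave and Joe critique ("this is how soccer moms respond")."""
    segment: str
    preferences: dict[str, float]  # e.g. {"price": 0.6, "convenience": 0.8}

    def predict_preference(self, attribute: str) -> float:
        # Same answer in every situation.
        return self.preferences.get(attribute, 0.0)


@dataclass
class ModeAwareTwin:
    """Preferences conditioned on the mode the person is operating in,
    so the same person can weigh the same attribute differently by context."""
    segment: str
    preferences_by_mode: dict[tuple[str, str], float]  # keyed by (mode, attribute)

    def predict_preference(self, attribute: str, mode: str) -> float:
        # No answer without a mode: context is required, not optional.
        return self.preferences_by_mode.get((mode, attribute), 0.0)


# The same simulated shopper answers differently in different modes,
# which a static demographic profile cannot represent.
shopper = ModeAwareTwin(
    segment="suburban parent",
    preferences_by_mode={
        ("restocking", "price"): 0.9,  # weekly grocery run: price-driven
        ("hosting", "price"): 0.3,     # party prep: price matters far less
        ("hosting", "novelty"): 0.8,
    },
)

print(shopper.predict_preference("price", mode="restocking"))  # 0.9
print(shopper.predict_preference("price", mode="hosting"))     # 0.3
```

The design choice is the point: once every preference lookup requires a mode, the twin can only answer in context, which is the same constraint situational research imposes on human interviews.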

Joe's Wall-E Scenario

The Terminator isn't what worries Joe. Wall-E is.

Personal language models (PLMs) living in your Alexa, learning everything you say and do, eventually making purchasing decisions on your behalf, until research shifts its focus from you to the PLM that represents you. Consumers with no agency, guided entirely by AI intermediaries aligned with the companies serving them.

It's not a far-fetched scenario. It's a direction already visible in the data.

The Consent Problem and What CVS Actually Did with Simile

The podcast raises a pointed question about how 400,000 people came to exist as AI twins. It's worth knowing the specifics, because the details matter.

CVS Health, which invested in Simile through its venture capital arm and is among the company's flagship customers, built its AI twin program through AI-moderated interviews with real people. Participants answered detailed questions about their healthcare experiences, preferences, and behaviors across more than 200 behavioral scenarios. CVS also pulled in demographic information participants had previously shared, including data like political affiliation. In total, the program generated nearly 3 million consented responses that now power more than 100,000 digital twins.

CVS describes the value clearly: the twins allow teams to test workflows, messages, and service designs with hard-to-reach populations, like patients with chronic conditions, who are slow and expensive to recruit through traditional research.

The consent claim is technically accurate. But Aransas raises the right question: was it buried in very fine print? When someone fills out a health survey, talks to customer support, or responds to a pharmacy questionnaire, are they genuinely imagining that their responses will be used to build a permanent digital replica of themselves, one that can be interviewed indefinitely, without them in the room?

Technical compliance and genuine informed consent are not the same thing. And in a domain as sensitive as healthcare, that gap deserves serious scrutiny.

One detail CVS hasn't shared publicly: how long were those original interviews? We know the AI-run research studies that query the twins take 15 to 30 minutes. But the duration of the interviews that built the twins in the first place, the ones that generated nearly 3 million responses from 400,000 real people, hasn't appeared in any published coverage. That's a surprisingly basic piece of information to be missing. Did participants sit through a 10-minute survey? An hour-long conversation? Did they know they were being interviewed specifically to train an AI replica of themselves, or did it feel like a routine customer feedback session? Those questions matter enormously to what "consent" actually means here, and the fact that no one seems to be asking them is itself worth noting.

The Stakes

Sameer Munshi, Head of Behavioral Sciences at EY, is quoted saying: "If you can predict behavior, this isn't just an accelerator for research. This is strategy."

That framing captures what's actually at stake here. Dave sees it through the lens of superpowers: AI genuinely gives companies the ability to do things they couldn't do otherwise. The question is whether what they're doing actually reflects how real humans behave or a simulation of humans that's drifted so far from the source it's become its own thing.

That's not a technical question. It's an experience strategy question.

Want to go deeper? Dave, Joe, and Aransas explore all of this, and more, on the latest episode of The Experience Strategy Podcast.

→ Listen now

Stone Mantel is an experience strategy research and consulting firm with 20 years of practice. The Experience Strategy Podcast is hosted by Dave Norton, Joe Pine, and Aransas Savas.
