What stops AI from helping users articulate preferences they cannot express?
This explores the 'gulf of envisioning' — why AI struggles to help users surface preferences they can't yet put into words, and where the failure actually lives.
The 'gulf of envisioning' is the gap between what users want and what they can articulate, and the question is where AI breaks down in helping to close it. The corpus points to a clear answer: the bottleneck is interaction design, not model intelligence. Intent is not a fixed thing waiting to be retrieved; it matures through dialogue. The note "How do users actually form intent when prompting AI systems?" frames intent as continuous maturation through 'progressive constraint resolution': it is never simply present or absent, but resolves gradually and with fluctuating stability. AI systems fail partly because they cannot see inside a user's evolving cognitive state. "Why can't users articulate what they want from AI?" sharpens the diagnosis: models respond rather than probe, so they miss the chance to help users discover requirements they did not know they had. The fix it gestures at is counterintuitive. Instead of asking users to envision freely, which is hard, the system presents generated options to evaluate, which is easy, shifting the cognitive burden from open-ended creation to constrained judgment.
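The options-over-envisioning idea can be made concrete with a small sketch. Everything here is an assumption for illustration: the `IntentState` class, the `update_from_choice` function, and the rule that an attribute counts as resolved only when the chosen option differs on it from every rejected option are not taken from the notes. The point is only to show how a pick among generated options can resolve one constraint at a time.

```python
from dataclasses import dataclass, field


@dataclass
class IntentState:
    """Constraints resolved so far; unresolved attributes are simply absent."""
    constraints: dict = field(default_factory=dict)


def update_from_choice(state, options, chosen_index):
    """Update the state from the option the user picked (hypothetical rule).

    An attribute is treated as resolved only if the chosen option's value
    for it differs from the value in every rejected option. Attributes the
    options share carry no signal and stay unresolved.
    """
    chosen = options[chosen_index]
    rejected = [o for i, o in enumerate(options) if i != chosen_index]
    if not rejected:
        # A single option gives the user nothing to judge between.
        return state
    for key, value in chosen.items():
        if key in state.constraints:
            continue  # already resolved in an earlier round
        if all(r.get(key) != value for r in rejected):
            state.constraints[key] = value
    return state


# Round 1: the options differ only in tone, so only tone can be resolved.
state = IntentState()
round_one = [
    {"tone": "formal", "length": "short"},
    {"tone": "casual", "length": "short"},
    {"tone": "playful", "length": "short"},
]
update_from_choice(state, round_one, chosen_index=1)
```

A second round would then vary an attribute that is still open, such as length, while holding the resolved tone fixed, so each evaluation narrows the intent a little further.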
The scale of the problem is striking. "Why do AI agents miss most of what users actually want?" reports that on UserBench, where users reveal goals incrementally over multiple turns, models fully align with user intent only 20% of the time, and even top models uncover fewer than 30% of user preferences through active questioning. The named culprits, passivity and premature assumption-making, are exactly the behaviors that block articulation: a system that guesses early and stops probing never helps you find the preference you could not state. "When should AI agents ask users instead of just searching?" offers a concrete remedy borrowed from how humans actually talk: 'insert expansions', the clarifying sub-questions people slip in before answering, used as a formal trigger for when an agent should probe instead of silently chaining tools toward the wrong target.
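One way to picture an insert expansion as a trigger is a check that runs before every tool call. The sketch below is a simplification under stated assumptions: the names `Ask`, `CallTool`, and `next_action` are hypothetical, and "a required argument is still unknown" stands in for the richer conditions conversation analysis describes. It covers only the intent-clarifying kind of expansion.

```python
from dataclasses import dataclass


@dataclass
class Ask:
    """Pause and put a clarifying sub-question to the user."""
    question: str


@dataclass
class CallTool:
    """Proceed with a tool call using fully specified arguments."""
    name: str
    args: dict


def next_action(tool_name, required, known):
    """Probe instead of guessing when a required argument is missing.

    `required` lists the argument names the tool needs; `known` maps the
    argument names the user has actually supplied to their values.
    """
    missing = [slot for slot in required if slot not in known]
    if missing:
        # Insert expansion: ask before answering, rather than assuming a value.
        return Ask(f"Before I search: what {missing[0]} do you have in mind?")
    return CallTool(tool_name, {slot: known[slot] for slot in required})


# The tool name and slots are made up for the example.
action = next_action(
    "search_flights",
    required=["destination", "date"],
    known={"destination": "Lisbon"},
)
```

Here `action` is an `Ask` about the date, so the agent consults the user instead of chaining a search on an assumed date.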
Here is the part you might not expect: even when AI does adapt to you, the adaptation is shallow and sometimes counterfeit. "Why don't conversational AI systems mirror their users' word choices?" shows that current systems fail to adapt their vocabulary to the user's. They lack the basic conversational alignment humans use to build rapport and shared meaning, which is one of the ways preferences get jointly constructed in dialogue. "Do AI guardrails refuse differently based on who is asking?" reveals a darker failure: GPT-3.5 refuses requests at different rates depending on the persona it perceives, and sycophantically declines to engage with political positions it infers the user would disagree with. Responses shaped by perceived identity rather than by the request corrupt the very signal that preference elicitation depends on.
There is also a deeper, structural limit beneath the interaction problem. "Can user preference guide AI writing tool alignment?" and "Can AI writing assistance remove distortion without losing appeal?" together expose a trap: writers prefer AI rewrites 63% of the time yet object to the persona distortions those same rewrites introduce, and the distortion and the appeal turn out to run through the same generative machinery. When reward models were trained to reduce the distortions, writer acceptance of the output fell as well. You cannot optimize for stated preference without dragging along the thing the user did not want. So 'articulating preferences' is partly impossible because the preference itself is contradictory: what people say they like and what they would endorse on reflection diverge.
The corpus also hints at routes around asking entirely. "Can agents learn preferences by watching rather than asking?" shows agents inferring preferences from continuously observed behavior rather than from interrogation, and "Can language models bridge the gap between critique and preference?" shows LLMs converting vague negative reactions ("doesn't look good for a date") into actionable positive preferences ("prefer more romantic"), turning the half-formed feeling a user can express into something a system can act on. Taken together, the answer to what stops AI is layered: it responds instead of probing, assumes too early, fails to mirror, sometimes flatters, and ultimately chases a target, 'stated preference', that may be internally contradictory. The most promising work treats preference not as something to extract but as something to co-construct.
Sources (10 notes)
Human intent matures through progressive constraint resolution with fluctuating stability, not as a simple present-or-absent condition. The STORM framework and Clarify metric reveal that AI systems fail partly because they cannot access users' internal cognitive states during this evolution.
Intent develops through interaction, not in isolation. Since AI models respond rather than probe, they miss opportunities to help users discover unarticulated requirements. Structured dialogue that presents model-generated options shifts the cognitive burden from open-ended envisioning to constrained evaluation.
UserBench measured multi-turn interactions where users reveal goals incrementally and found models achieve full intent alignment just 20% of the time. Even top models uncover fewer than 30% of user preferences through active querying, suggesting passivity and premature assumption-making are systematic failures.
Tool-enabled LLMs drift from user intent through silent tool chaining. Conversation analysis reveals insert-expansions—clarifying intent, scoping responses, enhancing appeal—as a formal framework for proactive user consultation that prevents misunderstanding instead of recovering from it.
Response generation models fail to adapt vocabulary toward users' lexical choices, a phenomenon central to human rapport and clarity. Post-training via DPO on coreference-identified preferences can teach models in-context convention formation.
GPT-3.5 refuses requests at different rates for younger, female, and Asian-American personas, and sycophantically declines to engage with political positions users would disagree with. Sports fandom and other non-political signals also shift refusal sensitivity.
Writers prefer AI rewrites 63% of the time but object to systematic persona distortions those same rewrites introduce. Mitigation studies show polish and distortion are entangled at the model level—preference optimization produces both simultaneously.
Training reward models successfully reduced measured persona distortions, but also reduced writer acceptance of the output. This suggests desirable properties like clarity and confidence operate through the same generative tendencies that produce problematic distortions.
M3-Agent demonstrates that separating episodic events from semantic knowledge in an entity-centric graph, combined with parallel memorization and control processes, allows agents to infer and act on user preferences without asking. This architecture mirrors human cognitive systems that bind disparate information about individuals across sensory modalities.
Few-shot LLM prompting can convert natural negative feedback like "doesn't look good for a date" into positive preferences like "prefer more romantic," enabling retrieval systems to find better-matching recommendations without fine-tuning.
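The critique-to-preference conversion in the last note can be sketched as a few-shot prompt builder. This is a minimal illustration, not the method from the underlying work: the function name `build_prompt`, the prompt wording, and two of the three demonstrations are invented here, and only the date example comes from the text. The sketch stops at building the prompt string; sending it to a model is left out because no particular API is named in the notes.

```python
# Demonstration pairs of (negative critique, positive preference).
FEW_SHOT = [
    ("doesn't look good for a date", "prefer more romantic"),  # from the text
    ("too loud for a work lunch", "prefer quieter"),           # hypothetical
    ("way over my budget", "prefer cheaper"),                  # hypothetical
]


def build_prompt(critique, examples=FEW_SHOT):
    """Assemble a few-shot prompt that ends where the model should answer."""
    lines = ["Rewrite each negative critique as a positive preference.", ""]
    for negative, positive in examples:
        lines.append(f"Critique: {negative}")
        lines.append(f"Preference: {positive}")
        lines.append("")
    # The final, unanswered pair is the one the model is asked to complete.
    lines.append(f"Critique: {critique}")
    lines.append("Preference:")
    return "\n".join(lines)


prompt = build_prompt("feels too touristy")
```

The completion a model returns for `prompt` could then be used as a query for a retrieval system, which is how the note describes finding better-matching recommendations without fine-tuning.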