INQUIRING LINE

How do users fail to articulate what they actually want?

This explores the gap between what users say they want and what they actually need from AI. The research suggests the failure isn't laziness or vagueness on the user's part: intent itself doesn't exist fully formed until it's drawn out through interaction.


The most useful reframe in the corpus is that users don't fail to articulate intent because they're careless; they fail because intent isn't a fixed thing sitting in their head waiting to be typed out. "How do users actually form intent when prompting AI systems?" argues that intent matures through progressive constraint resolution, with stability that fluctuates along the way. You don't *have* the requirement and then describe it; you discover it by bumping into options. "Why can't users articulate what they want from AI?" names this the 'gulf of envisioning' and points to the real culprit: AI models respond rather than probe, so they never help users do the discovery work. The fix it proposes is counterintuitive: shift the user from open-ended envisioning (hard) to constrained evaluation of model-generated options (easy).
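The shift from open-ended envisioning to constrained evaluation can be illustrated with a small sketch. Everything in it is hypothetical: the `Option` type, the `resolve_constraints` function, and the hard-coded candidate values stand in for model-generated options, and the `choose` callback stands in for the user. It illustrates the idea only and is not the method described in the cited notes.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Option:
    facet: str  # the constraint this option would settle, e.g. "tone"
    value: str  # one candidate value for that constraint


def resolve_constraints(
    candidates: dict[str, list[str]],
    choose: Callable[[list[Option]], Option],
) -> dict[str, str]:
    """Settle one facet at a time by letting the user pick among shown options."""
    resolved: dict[str, str] = {}
    for facet, values in candidates.items():
        options = [Option(facet, v) for v in values]
        picked = choose(options)  # constrained evaluation instead of free text
        resolved[picked.facet] = picked.value
    return resolved


# Hypothetical stand-in for options a model might generate for "write me a bio".
candidates = {
    "length": ["one sentence", "one paragraph"],
    "tone": ["formal", "playful"],
}

# Simulated user who always picks the last option shown.
intent = resolve_constraints(candidates, choose=lambda opts: opts[-1])
print(intent)  # {'length': 'one paragraph', 'tone': 'playful'}
```

The user never has to describe the bio up front; the requirement accumulates one resolved constraint at a time, which is the pattern the notes describe.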

The scale of the problem is striking once measured. "Why do AI agents miss most of what users actually want?" reports that models reach full intent alignment only 20% of the time, and that even top models uncover fewer than 30% of user preferences when actively querying. The dominant failure modes are passivity and premature assumption-making: the model guesses early and stops listening. So part of why users 'fail' to articulate is that the system gives up on eliciting before the user has finished forming the thought.
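To see how the two figures differ, here is one plausible way to compute them. The definitions below are assumptions for illustration and may not match the benchmark's exact scoring, and the four episodes are made-up toy data, not results from any study.

```python
# Toy episodes (invented): preferences a simulated user held vs. those the agent uncovered.
episodes = [
    {"held": {"budget", "brand", "size"}, "uncovered": {"budget"}},
    {"held": {"airline", "seat"}, "uncovered": {"airline", "seat"}},
    {"held": {"cuisine", "price", "distance", "parking"}, "uncovered": {"price"}},
    {"held": {"genre", "length", "language"}, "uncovered": set()},
]


def full_capture_rate(eps: list[dict[str, set[str]]]) -> float:
    """Fraction of episodes in which every held preference was uncovered."""
    full = sum(1 for e in eps if e["held"] <= e["uncovered"])
    return full / len(eps)


def preference_coverage(eps: list[dict[str, set[str]]]) -> float:
    """Share of all held preferences, pooled across episodes, that were uncovered."""
    found = sum(len(e["held"] & e["uncovered"]) for e in eps)
    total = sum(len(e["held"]) for e in eps)
    return found / total


print(full_capture_rate(episodes))    # 0.25 (1 of 4 episodes fully captured)
print(preference_coverage(episodes))  # 4 of 12 preferences, about 0.33
```

The first number is all-or-nothing per episode, while the second gives partial credit, so the two can diverge considerably on the same interactions.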

Not all probing is equal, which is where this gets practical. "Which clarifying questions actually improve user satisfaction?" shows that asking 'What type of monitor?' beats asking 'What are you trying to do?': concrete, facet-level questions help, while asking users to re-explain their need just hands the burden back. There's even formal grounding for *when* to ask. "When should AI agents ask users instead of just searching?" borrows the idea of 'insert expansions' from conversation analysis to mark the moments an agent should pause and scope intent rather than silently chaining tools toward the wrong target.
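A minimal sketch of both ideas together: ask a concrete facet-level question while a facet is still unknown, and only move on to the tool call once the request is scoped. The facet list, the question wording beyond 'What type of monitor?', and the `next_step` function are all hypothetical, chosen only to make the pattern visible.

```python
# Hypothetical facets for a monitor-shopping request, each with a concrete question.
FACET_QUESTIONS = {
    "type": "What type of monitor?",
    "size": "What screen size?",
    "budget": "What is your budget?",
}


def next_step(known: dict[str, str]) -> tuple[str, str]:
    """Pause to ask about the first missing facet; otherwise proceed to search."""
    for facet, question in FACET_QUESTIONS.items():
        if facet not in known:
            # Pause and scope intent before any tool is called.
            return ("ask", question)
    # All facets are resolved, so build the search query from them.
    query = " ".join(known[facet] for facet in FACET_QUESTIONS)
    return ("search", query)


print(next_step({"type": "gaming"}))
# ('ask', 'What screen size?')
print(next_step({"type": "gaming", "size": "27 inch", "budget": "under $400"}))
# ('search', 'gaming 27 inch under $400')
```

Note that the sketch never falls back to an open-ended 'What are you trying to do?'; each question names the specific gap it is trying to fill.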

A less obvious finding: users often can't tell you, or even themselves, when articulation has failed. "Does user satisfaction actually measure cognitive understanding?" found that people report satisfaction while remaining internally confused, especially when they're unaware of their own knowledge gaps. So satisfaction surveys mask the problem. "Why do users drift away from their original information need?" goes deeper, to a classic information-science idea: people search precisely because they're in an 'anomalous state of knowledge', so they can't specify what they need *because not knowing it is the whole reason they're asking.* Articulation failure isn't a bug at the edge; it's the starting condition.

And the medium actively misleads. "Why do users fail with AI interfaces designed like conversations?" argues that conversational interfaces trigger users' lifelong communication instincts (instincts built for talking to someone who shares context and infers what you mean) while the AI isn't actually communicating in that sense. The mismatch produces failures that feel like user error but originate in the design. So the through-line across the corpus: users 'fail' to articulate because intent is emergent, the interface pretends otherwise, and neither side reliably notices when the gap is still open.


Sources (8 notes)

How do users actually form intent when prompting AI systems?

Human intent matures through progressive constraint resolution with fluctuating stability, not as a simple present-or-absent condition. The STORM framework and Clarify metric reveal that AI systems fail partly because they cannot access users' internal cognitive states during this evolution.

Why can't users articulate what they want from AI?

Intent develops through interaction, not in isolation. Since AI models respond rather than probe, they miss opportunities to help users discover unarticulated requirements. Structured dialogue that presents model-generated options shifts the cognitive burden from open-ended envisioning to constrained evaluation.

Why do AI agents miss most of what users actually want?

UserBench measured multi-turn interactions where users reveal goals incrementally and found models achieve full intent alignment just 20% of the time. Even top models uncover fewer than 30% of user preferences through active querying, suggesting passivity and premature assumption-making are systematic failures.

Which clarifying questions actually improve user satisfaction?

Clarifying questions that target concrete information gaps ("What type of monitor?") consistently beat those that ask users to rephrase their needs ("What are you trying to do?"). Users engage most when they can foresee how answering improves results.

When should AI agents ask users instead of just searching?

Tool-enabled LLMs drift from user intent through silent tool chaining. Conversation analysis reveals insert-expansions—clarifying intent, scoping responses, enhancing appeal—as a formal framework for proactive user consultation that prevents misunderstanding instead of recovering from it.

Does user satisfaction actually measure cognitive understanding?

STORM shows users express satisfaction despite internal confusion, especially when unaware of knowledge gaps. Sustained engagement correlates with actual self-understanding, not immediate satisfaction ratings.

Why do users drift away from their original information need?

Belkin & Vickery's anomalous state of knowledge explains why users pursuing one information need gradually deviate into sub-topics. Topic shift detection models identify this drift with 84% precision without predetermined topic sets.

Why do users fail with AI interfaces designed like conversations?

AI interfaces that use conversational design conventions trigger users' lifelong communication skills, but AI doesn't actually communicate. This mismatch causes interaction failures that feel like user error but originate in design.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a research analyst tracking how user intent formation actually works in AI-mediated interaction. The question is still open: *Why do users struggle to articulate what they want—and when can an AI system help rather than hinder that discovery?*

What a curated library found — and when (dated claims, not current truth):
Findings span July 2023–December 2025. Key constraints reported:
- Even top models fully capture user intent only ~20% of the time, missing >70% of preferences even when actively querying (2024).
- Concrete, facet-level clarifying questions ('What type of monitor?') outperform open-ended re-explanation prompts ('What are you trying to do?') (2024).
- Users report satisfaction while remaining internally confused; anomalous-state-of-knowledge (not knowing what you need) is the *starting condition*, not a rare edge case (2023–2024).
- Conversational interfaces trigger communication instincts built for human interaction—shared context, inference, repair—that don't apply to stateless LLMs, producing systematic misalignment (2023).
- Formal framework from conversation analysis (insert-expansions) identifies specific moments when agents should pause to scope intent rather than assume and chain tools (2023).

Anchor papers (verify; mind their dates):
- 2309.14459 (Bridging the gulf of envisioning, Sept 2023)
- 2402.01934 (Clarifying the path to user satisfaction, Feb 2024)
- 2507.21919 (Warm/empathetic training degrades reliability, July 2025)
- 2507.22034 (UserBench interactive gym, July 2025)

Your task:
(1) RE-TEST the 20% intent-capture ceiling and passive-assumption failure modes. Have multi-turn agentic systems, structured reasoning (chain-of-thought, tree-search), or memory/context windows since relaxed this constraint? Separately, test whether conversational-interface mismatch persists or whether recent UI/orchestration changes have narrowed the gulf. State plainly what still holds.
(2) Surface contradicting or superseding work from the last 6 months—especially papers on emotional framing (2507.21083), tone-reading (2507.21083), or grounding under uncertainty (2506.08952)—that may show the problem is *worse* (sycophancy masks intent gaps) or *different* (emotional/tonal signals now obscure rather than clarify intent).
(3) Propose 2 research questions that assume the regime may have shifted: (a) Do agentic systems with explicit-confirmation checkpoints recover hidden preferences, or do checkpoints themselves corrupt user articulation? (b) Can LLMs reliably detect their own intent-capture failure in real time, and does such meta-awareness change user behavior?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
