INQUIRING LINE

Can language models ask clarifying questions when sentences are ambiguous?

This explores whether LLMs can detect an ambiguous or underspecified sentence and respond by asking a clarifying question — and what the corpus says about why that behavior is rare and how it can be trained.


The corpus splits the problem into two halves: recognizing the ambiguity, and acting on it. On recognition, the picture is bleak. The AMBIENT benchmark finds GPT-4 correctly disambiguates deliberately ambiguous sentences only 32% of the time, against 90% for humans, across lexical, structural, and scope ambiguity (see "Can language models recognize when text is deliberately ambiguous?"). The deeper issue isn't vocabulary; it's that the model struggles to hold two readings of a sentence in mind at once. And being a strong reasoner doesn't rescue a model: those that ace fully-specified problems drop to 40-50% accuracy when one needed variable is withheld and the task becomes 'figure out what to ask' (see "Can models identify what information they actually need?"). Asking good questions is a separate skill from answering well.
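To make the 'figure out what to ask' task concrete, here is a minimal Python sketch. It is a toy illustration, not the benchmark's method: the function names, the `speed`/`time` variables, and the distance formula are all assumptions chosen for the example. The point is only that detecting the gap is a separate step from computing the answer.

```python
from typing import Optional


def missing_variable(required: set[str], given: dict[str, float]) -> Optional[str]:
    """Return one required variable absent from `given`, or None if fully specified."""
    missing = sorted(required - set(given))
    return missing[0] if missing else None


def respond(required: set[str], given: dict[str, float]) -> str:
    """Ask for the withheld variable instead of guessing; otherwise solve."""
    gap = missing_variable(required, given)
    if gap is not None:
        # Information-gathering branch: the problem is underspecified.
        return f"Before I answer: what is the {gap}?"
    # Execution branch: hypothetical task, distance = speed * time.
    return f"distance = {given['speed'] * given['time']}"


underspecified = respond({"speed", "time"}, {"speed": 60.0})
complete = respond({"speed", "time"}, {"speed": 60.0, "time": 2.0})
```

In this toy the required variables are listed up front, which is exactly the part a model has to infer for itself from the problem text.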

So why don't models just ask? Part of the answer is how they're trained. Standard RLHF rewards immediate helpfulness on the very next turn, which quietly teaches models to barrel ahead and answer rather than pause to ask. CollabLLM shows that swapping in rewards that estimate the long-term value of an exchange flips this, enabling active intent discovery (see "Why do language models respond passively instead of asking clarifying questions?"). There's also a hidden shortcut: many models look like they're reasoning about an underspecified problem when they're really just defaulting to the conservative option. Twelve of fourteen models actually got *worse* when constraints were removed, by up to 38.5 percentage points (see "Are models actually reasoning about constraints or just defaulting conservatively?"). Apparent caution can be a reflex, not genuine recognition that something is missing.
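The incentive difference can be sketched with a toy comparison in Python. This is not CollabLLM's implementation: the `Turn` class, the discount factor, and every score below are hypothetical numbers made up for illustration. It only shows how a next-turn objective and a discounted multi-turn objective can rank the same two candidate responses differently.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    action: str          # "answer" or "ask"
    immediate: float     # hypothetical helpfulness score for this turn alone
    future: list[float]  # hypothetical scores for the turns this response leads to


def next_turn_reward(turn: Turn) -> float:
    """Reward that only sees the current turn."""
    return turn.immediate


def multi_turn_reward(turn: Turn, discount: float = 0.9) -> float:
    """Current-turn score plus discounted scores of the follow-up turns."""
    total = turn.immediate
    for step, score in enumerate(turn.future, start=1):
        total += (discount ** step) * score
    return total


candidates = [
    # Guess now: looks helpful immediately, but the follow-up is a correction.
    Turn("answer", immediate=0.6, future=[0.2]),
    # Ask first: less helpful this turn, but the next answer is on target.
    Turn("ask", immediate=0.3, future=[0.9]),
]

best_now = max(candidates, key=next_turn_reward).action
best_long = max(candidates, key=multi_turn_reward).action
```

With these made-up scores the next-turn objective prefers answering and the multi-turn objective prefers asking; real systems would have to estimate the future scores rather than read them from a list.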

The encouraging news is that the clarifying instinct is teachable. Reinforcement learning pushed proactive 'wait, this problem is flawed' accuracy from essentially zero (0.15%) to 74% on deliberately flawed math problems. The ability is fragile without that training, though: letting an untrained model think longer made it worse, while the same inference-time scaling helped after RL (see "Can models learn to ask clarifying questions instead of guessing?"). More intriguingly, social meta-learning produces the behavior as an emergent side effect: train models only on complete problems, and they generalize to underspecified ones by spontaneously asking for what they need and delaying their answer (see "Can models learn to ask clarifying questions without explicit training?"). The model learns to treat the conversation itself as a place to gather information.

But asking *a* question isn't the same as asking a *good* one. The ALFA framework breaks question quality into concrete attributes (clarity, relevance, specificity) and trains on attribute-specific preference pairs rather than a single 'good/bad' score. This matters most in high-stakes settings like clinical reasoning, where the right clarifying question directly changes the decision (see "Can models learn to ask genuinely useful clarifying questions?"). Underlying all of this is a calibration question: a model can only ask when it recognizes that it doesn't know. Small models trained with uncertainty-aware objectives and an option to abstain can match models ten times their size on conversation forecasting, which suggests the 'sense of not knowing' that should trigger a clarifying question exists in LLMs but is badly undertrained by default (see "Can models learn to abstain when uncertain about predictions?").
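One way to picture calibration as the trigger is a simple decision rule, sketched here in Python. It is an assumption-laden illustration, not a method from the cited notes: the probabilities over readings are hypothetical inputs (a real system would have to estimate them), and the entropy threshold of 0.7 bits is an arbitrary choice.

```python
import math


def entropy(probs: list[float]) -> float:
    """Shannon entropy in bits; higher means the readings are harder to tell apart."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def decide(interpretations: dict[str, float], max_entropy_bits: float = 0.7) -> str:
    """Ask when uncertainty over readings is high; otherwise answer with the top reading."""
    probs = list(interpretations.values())
    if entropy(probs) > max_entropy_bits:
        options = " or ".join(sorted(interpretations))
        return f"ask: did you mean {options}?"
    best = max(interpretations, key=interpretations.get)
    return f"answer: {best}"


# Hypothetical probabilities for two readings of "bank".
ambiguous = decide({"river bank": 0.5, "financial bank": 0.5})
clear = decide({"river bank": 0.95, "financial bank": 0.05})
```

The rule is only as good as the probabilities fed into it, which is why undertrained calibration matters: a model that is confidently wrong about its readings never crosses the threshold.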

The thread worth pulling: clarifying questions sit at the intersection of three abilities we usually measure separately: noticing ambiguity, knowing you're uncertain, and valuing a future turn over the present one. Standard training leaves all three underdeveloped, and next-turn RLHF actively discourages the third. So the honest answer is 'not by default, but yes when trained for it', and what gets trained is less a new skill than permission to stop guessing.


Sources (8 notes)

Can language models recognize when text is deliberately ambiguous?

The AMBIENT benchmark shows GPT-4 correctly disambiguates only 32% of cases versus 90% for humans. This failure spans lexical, structural, and scope ambiguity, suggesting that LLMs struggle to hold multiple interpretations simultaneously, a fundamental gap hidden by standard benchmarks.

Can models identify what information they actually need?

Models achieving high accuracy on complete reasoning tasks drop to 40-50% accuracy at identifying which clarifying question to ask when one variable is withheld. Information gathering and problem execution are separable cognitive operations.

Why do language models respond passively instead of asking clarifying questions?

CollabLLM demonstrates that standard RLHF training optimizes for immediate helpfulness, discouraging models from asking clarifying questions or offering multi-turn insights. Multi-turn-aware rewards that estimate long-term interaction value enable active intent discovery and genuine collaboration.

Are models actually reasoning about constraints or just defaulting conservatively?

Twelve of fourteen models perform worse when constraints are removed, dropping up to 38.5 percentage points. Models appear to reason correctly by defaulting to harder options, not by actually evaluating constraints.

Can models learn to ask clarifying questions instead of guessing?

Reinforcement learning increased proactive critical thinking accuracy from 0.15% to 73.98% on deliberately flawed math problems. Notably, inference-time scaling degraded this ability in untrained models but improved it after RL training, suggesting the capability is learnable but fragile without explicit training.

Can models learn to ask clarifying questions without explicit training?

Models trained via social meta-learning (SML) on complete problems generalize to underspecified tasks by asking for needed information and delaying answers. The training paradigm instills a meta-strategy of using conversation as an information source, addressing the premature-answering failure mode.

Can models learn to ask genuinely useful clarifying questions?

The ALFA framework breaks down question quality into theory-grounded attributes (clarity, relevance, specificity) and trains models on 80K attribute-specific preference pairs. Attribute-specific optimization outperforms single-score training, especially in clinical reasoning where asking the right clarifying question directly impacts decision quality.

Can models learn to abstain when uncertain about predictions?

Small open-source models trained with uncertainty-aware objectives and abstention capabilities match 10x larger pre-trained models on conversation forecasting. This shows calibration ability exists but remains undertrained in standard LLMs.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are an AI researcher auditing whether language models can recognize ambiguity and ask clarifying questions — treating this as an unsolved frontier, not a closed problem.

What a curated library found, and when (dated claims, not current truth): papers spanning 2023–2026 identified five core constraints:
• Recognition bottleneck: GPT-4 disambiguates only 32% of deliberately ambiguous sentences vs. 90% for humans; strong reasoning on fully-specified problems does not transfer to spotting missing information (2023–2024).
• Training penalty: standard RLHF rewards immediate answers, not future-turn value; models default to a safe, conservative bias rather than genuinely sensing ambiguity (2025–2026).
• Teachability: RL targeting proactive critique lifts detection from 0.15% to 74% on broken problems; social meta-learning produces clarifying questions as emergent behavior when trained only on complete examples (2026).
• Question quality separates from asking: ALFA framework decomposes clarity, relevance, specificity; matters most in high-stakes domains like clinical reasoning (2025).
• Calibration gap: small models with uncertainty-aware objectives + abstention option match 10× larger models, suggesting 'sense of not knowing' exists but is undertrained by default (2024–2025).

Anchor papers (verify; mind their dates):
• arXiv:2304.14399 (2023): AMBIENT ambiguity recognition failure
• arXiv:2502.14860 (2025): clinical reasoning case study on question quality
• arXiv:2602.16488 (2026): social meta-learning emergence
• arXiv:2602.07338 (2026): multi-turn intent mismatch

Your task:
(1) RE-TEST each constraint: (a) Does improved instruction-tuning or chain-of-thought scaffolding now help GPT-4+ reliably hold multiple readings of an ambiguous sentence? (b) Have newer RLHF variants (e.g., process reward models, outcome+process hybrid) actually shifted the incentive away from barrel-ahead defaults? (c) Are the 74% gains from RL actually durable in deployment, or do they collapse on out-of-distribution underspecified problems? Distinguish what is truly solved from what merely shifted the bar.
(2) Surface the strongest contradicting or superseding work from the last 6 months: Has a recent paper shown that frontier models (o1, Claude-4, Gemini-3) already exhibit spontaneous clarifying behavior without retraining, or that a simpler method (e.g., prompt engineering, external ambiguity detector) outpaces the RL approaches documented here?
(3) Propose 2 research questions assuming the regime has moved: (a) If models now routinely ask clarifying questions, what determines when they ask too many vs. too few — is there a principled stopping rule? (b) Do clarifying-question capabilities transfer across domains (e.g., from math to clinical to legal), or are they domain-specific learned reflexes?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
