INQUIRING LINE

Knowing how to solve a problem and knowing what's missing from one turn out to be completely different skills for AI.

Can models identify information gaps without just guessing or refusing to answer?

This explores whether models can tell when they're missing information and respond by asking for it — rather than the two failure modes of bluffing an answer or flatly refusing.

The corpus says the answer is a qualified yes: models can recognize a knowledge gap and act on it productively (by asking, abstaining, or seeking), but only because the gap-spotting skill is separate from the answer-producing skill and has to be trained on its own. The most striking evidence is that being good at problems doesn't make a model good at noticing what a problem is missing: models that ace complete reasoning tasks fall to 40-50% accuracy when one variable is withheld and they must work out *which* clarifying question to ask (see "Can models identify what information they actually need?"). Information-gathering and problem-execution are separable cognitive operations, which is why a strong solver still blurts out an answer to an under-specified prompt.

The encouraging news is that the gap-detection muscle is learnable. Reinforcement learning pushed 'proactive critical thinking' accuracy on deliberately flawed math problems from essentially zero (0.15%) to nearly 74%. Revealingly, giving the model more inference-time compute *hurt* untrained models (they overthought their way into an answer) but *helped* trained ones (see "Can models learn to ask clarifying questions instead of guessing?"). You can even get the behavior to emerge without ever labeling underspecified cases: train on complete problems via social meta-learning and the model generalizes to ask for missing pieces and delay answering, treating the conversation itself as a place to fetch information (see "Can models learn to ask clarifying questions without explicit training?").

But 'ask a question' isn't enough if the question is generic. Two threads tackle question *quality*. One decomposes it into measurable attributes (clarity, relevance, specificity) and trains on attribute-specific preferences, which beats optimizing for a single quality score, especially in clinical reasoning where the right question changes the diagnosis (see "Can models learn to ask genuinely useful clarifying questions?"). The other treats clarification as a search problem: simulate the possible answers each candidate question could yield, score them by how much they'd shrink the model's uncertainty, and ask the one with the highest information gain (see "How can models select the most informative question to ask?"). That's the difference between 'can you tell me more?' and a targeted question that actually resolves the ambiguity.
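The information-gain idea can be made concrete with a small sketch. This is not the method from the cited note, only a toy illustration under simplifying assumptions: the model's uncertainty is a probability distribution over a handful of hypotheses, and each hypothesis determines the answer a user would give to each question. The function names, the diagnoses, and the questions are all hypothetical.

```python
import math
from collections import defaultdict


def entropy(probs):
    """Shannon entropy, in bits, of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_information_gain(prior, answer_for):
    """Expected entropy reduction over hypotheses from asking one question.

    prior:      dict mapping hypothesis -> probability (sums to 1).
    answer_for: dict mapping hypothesis -> the answer the user would give
                if that hypothesis were true (deterministic; an assumption).
    """
    # Probability of hearing each possible answer.
    p_answer = defaultdict(float)
    for hypothesis, p in prior.items():
        p_answer[answer_for[hypothesis]] += p

    # Average the posterior entropy over the possible answers.
    expected_posterior = 0.0
    for answer, pa in p_answer.items():
        posterior = [p / pa for h, p in prior.items() if answer_for[h] == answer]
        expected_posterior += pa * entropy(posterior)

    return entropy(prior.values()) - expected_posterior


def best_question(prior, questions):
    """Pick the question with the highest expected information gain."""
    return max(questions, key=lambda q: expected_information_gain(prior, questions[q]))


# Hypothetical toy case: four equally likely diagnoses, three candidate questions.
prior = {"flu": 0.25, "strep": 0.25, "cold": 0.25, "allergy": 0.25}
questions = {
    "Can you tell me more?": {"flu": "yes", "strep": "yes", "cold": "yes", "allergy": "yes"},
    "Do you have a fever?": {"flu": "yes", "strep": "yes", "cold": "no", "allergy": "no"},
    "Are your eyes itchy?": {"flu": "no", "strep": "no", "cold": "no", "allergy": "yes"},
}

for question in questions:
    gain = expected_information_gain(prior, questions[question])
    print(f"{question!r}: {gain:.3f} bits")
print("ask:", best_question(prior, questions))
```

In this toy setup the generic question scores zero bits because every hypothesis yields the same answer, while the fever question splits the hypotheses evenly and gains a full bit. A real system would have to estimate both the prior and the simulated answers with the model itself.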

The third path isn't asking at all; it's knowing when to hold back. Small models trained with uncertainty-aware objectives and an abstention option can match models 10x their size on conversation forecasting, simply by declining the calls they'd get wrong (see "Can models learn to abstain when uncertain about predictions?"). And rather than abstaining, a model can use its own draft answer as a probe: ITER-RETGEN shows that a partial response surfaces information needs the original query couldn't express, so the draft becomes a better retrieval query than the question itself (see "Can a model's partial response guide what to retrieve next?"). Calibration is the common thread: confident models resist prompt perturbation while low-confidence ones swing wildly, so confidence is itself a usable signal for when to commit versus seek (see "Does model confidence predict robustness to prompt changes?").
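The abstention trade-off described above can be shown with a minimal sketch of selective prediction: answer only when confidence clears a threshold, then report how often the model answered (coverage) and how often those answers were right. The data below is invented purely for illustration, and `selective_report` is a hypothetical helper, not something from the cited work.

```python
def selective_report(predictions, threshold):
    """Summarize a predictor that abstains below a confidence threshold.

    predictions: list of (confidence, is_correct) pairs.
    Returns (coverage, accuracy_on_answered); accuracy is None when the
    model abstains on everything.
    """
    answered = [is_correct for confidence, is_correct in predictions
                if confidence >= threshold]
    coverage = len(answered) / len(predictions)
    accuracy = sum(answered) / len(answered) if answered else None
    return coverage, accuracy


# Invented toy data: a reasonably calibrated model is right more often
# when it is more confident.
toy_predictions = [
    (0.95, True), (0.90, True), (0.80, True), (0.60, False),
    (0.55, True), (0.40, False), (0.30, False), (0.20, False),
]

for threshold in (0.0, 0.5, 0.7):
    coverage, accuracy = selective_report(toy_predictions, threshold)
    print(f"threshold={threshold}: coverage={coverage:.2f}, accuracy={accuracy:.2f}")
```

On this toy data, answering everything gives 50% accuracy, while abstaining below 0.7 confidence answers fewer than half the items but gets all of them right. The gain only appears when confidence tracks correctness, which is why calibration matters for the abstention route.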

Here's the thing you might not have expected: the biggest obstacles to gap-spotting aren't about reasoning power, they're about disposition and training incentives. Models often agree with false premises not from ignorance but from face-saving agreeableness baked in by RLHF (GPT rejects bad presuppositions 84% of the time, Mistral only 2.44%), a social accommodation problem distinct from hallucination (see "Why do language models agree with false claims they know are wrong?"). And reasoning-tuned models actually do *worse* on ill-posed questions, generating long redundant chains for problems with missing premises that plain models correctly flag as unanswerable, because training rewarded producing reasoning steps but never taught the model when to disengage (see "Why do reasoning models overthink ill-posed questions?"). So the capability exists and is teachable through several routes (RL, meta-learning, calibration, information-gain search), but standard training pipelines quietly optimize it away, which is why a model that can solve anything will still confidently answer a question it should have questioned.

Sources (10 notes)

Can models identify what information they actually need?

Models achieving high accuracy on complete reasoning tasks drop to 40-50% accuracy identifying what clarifying question to ask when one variable is withheld. Information gathering and problem execution are separable cognitive operations.

Can models learn to ask clarifying questions instead of guessing?

Reinforcement learning training increased proactive critical thinking accuracy from 0.15% to 73.98% on deliberately flawed math problems. Notably, inference-time scaling degraded this ability in untrained models but improved it after RL training, suggesting the capability is learnable but fragile without explicit training.

Can models learn to ask clarifying questions without explicit training?

Models trained via social meta-learning (SML) on complete problems generalize to underspecified tasks by asking for needed information and delaying answers. The training paradigm instills a meta-strategy of using conversation as an information source, addressing the premature-answering failure mode.

Can models learn to ask genuinely useful clarifying questions?

The ALFA framework breaks down question quality into theory-grounded attributes (clarity, relevance, specificity) and trains models on 80K attribute-specific preference pairs. Attribute-specific optimization outperforms single-score training, especially in clinical reasoning where asking the right clarifying question directly impacts decision quality.

How can models select the most informative question to ask?

UoT (Uncertainty of Thoughts) combines uncertainty-aware scenario simulation with information-gain scoring and reward propagation to identify questions whose possible answers maximally reduce diagnostic uncertainty. This provides a principled mechanism for specific, high-value clarification rather than generic prompts.

Can models learn to abstain when uncertain about predictions?

Small open-source models trained with uncertainty-aware objectives and abstention capabilities match 10x larger pre-trained models on conversation forecasting. This shows calibration ability exists but remains undertrained in standard LLMs.

Can a model's partial response guide what to retrieve next?

ITER-RETGEN shows that iteratively using generated responses as retrieval queries substantially improves performance on multi-hop reasoning and fact verification. Generation acts as both answer producer and information-need clarifier, surfacing implicit gaps that the original query missed.

Does model confidence predict robustness to prompt changes?

ProSA found that when models are highly confident, they resist prompt rephrasing; low confidence causes major output swings. Larger models, few-shot examples, and objective tasks all correlate with higher confidence and greater robustness.

Why do language models agree with false claims they know are wrong?

The FLEX benchmark shows models reject false presuppositions at dramatically different rates (GPT 84% vs Mistral 2.44%), not from ignorance but from preference for agreement learned via RLHF. This social accommodation is distinct from hallucination and requires different fixes.

Why do reasoning models overthink ill-posed questions?

Reasoning models generate redundant, lengthy responses to questions with missing premises while non-reasoning models correctly identify them as unanswerable. Training optimizes for producing reasoning steps but never teaches models when to disengage.

Papers this line draws on (8)

The research behind the notes this line reads — ranked by how closely each paper relates.

AbstentionBench: Reasoning LLMs Fail on Unanswerable Questions (match 3.31, arXiv)
Explain-Query-Test: Self-Evaluating LLMs Via Explanation and Comprehension Discrepancy (match 3.29, arXiv)
Learning to Learn from Language Feedback with Social Meta-Learning (match 2.54, arXiv)
Intent Mismatch Causes LLMs to Get Lost in Multi-Turn Conversation (match 2.51, arXiv)
QuestBench: Can LLMs ask the right question to acquire information in reasoning tasks? (match 2.48, arXiv)
Aligning LLMs to Ask Good Questions: A Case Study in Clinical Reasoning (match 1.75, arXiv)
Linguistic Calibration of Long-Form Generations (match 1.74, arXiv)
STaR-GATE: Teaching Language Models to Ask Clarifying Questions (match 1.70, arXiv)

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a research analyst re-testing whether models can identify information gaps without guessing or refusing—a question that remains open despite recent progress. A curated library (Feb 2024–Feb 2026) found:

**What a curated library found — and when (dated claims, not current truth):**
- Gap-spotting is a learnable, separable skill from problem-solving: models acing reasoning tasks drop to 40–50% when asked to detect a missing variable (2024–25).
- RL on deliberately flawed problems pushed proactive gap-identification from 0.15% to ~74%; inference-time compute hurt untrained models but helped trained ones (2025).
- Decomposing question *quality* into measurable attributes (clarity, relevance, specificity) and training on preferences outperforms single-score optimization, especially in clinical reasoning (2025).
- Uncertainty-aware objectives + abstention let small models match 10× larger models by declining low-confidence calls (2024).
- Training incentives actively harm gap-spotting: reasoning-tuned models overthink ill-posed questions instead of flagging missing premises; RLHF breeds agreeableness that masks knowledge gaps (84% vs. 2% rejection of false presuppositions, GPT vs. Mistral, 2024–25).

**Anchor papers (verify; mind their dates):**
- arXiv:2502.14860 (Feb 2025): Aligning LLMs to Ask Good Questions — clinical reasoning case study.
- arXiv:2507.23407 (Jul 2025): Beyond Passive Critical Thinking — proactive questioning.
- arXiv:2602.16488 (Feb 2026): Social Meta-Learning — emergent clarifying behavior without labeled underspecified cases.
- arXiv:2506.09038 (Jun 2025): AbstentionBench — reasoning LLMs on unanswerable questions.

**Your task:**
(1) **RE-TEST EACH CONSTRAINT.** For each finding above, judge whether newer models (GPT-4o, Claude 3.5, Llama 3.x, specialized reasoners post-Jun 2026), improved training methods (DPO, IPO, test-time scaling, multi-agent orchestration), tooling (extended-context retrieval, live web search, structured verification), or evaluation standards have since *relaxed or overturned* it. Separate the durable question (Can models learn to spot gaps? Should training incentivize abstention?) from the perishable limitation (current model baseline, specific RL approach). Cite what resolved it; flag constraints that still hold.

(2) **Surface the strongest *contradicting* or *superseding* work from the last ~6 months** — e.g., if newer reasoning models now refuse unsafely or over-abstain, or if prompt engineering + in-context examples now outperform explicit training, name it.

(3) **Propose 2 research questions that *assume* the regime has shifted:** e.g., "If reasoning-scale models now reliably detect missing information, does that change how we architect retrieval-augmented generation?" or "Can we measure whether a model's abstention is calibrated *cost-awareness* (avoiding harm) vs. mere uncertainty?"

**Cite arXiv IDs; flag anything you cannot ground in a real paper.**
