SYNTHESIS NOTE

How can models select the most informative question to ask?

Explores whether simulating possible futures and scoring questions by information gain can identify which clarifying question would best reduce uncertainty—moving beyond just deciding whether to ask toward deciding what to ask.

Synthesis note · 2026-02-22 · sourced from Question Answer Search

Most work on clarifying questions addresses WHETHER to ask. Uncertainty of Thoughts (UoT) addresses WHAT to ask, and provides a principled, information-theoretic mechanism for selecting the most informative question among the candidates the model generates.

The algorithm has three components working together:

Uncertainty-aware simulation: the model generates multiple candidate questions, then simulates possible future scenarios for each — what might the user answer, and what would each answer imply? These simulations form a tree structure of possible futures.
Information-gain rewards: each simulated path is scored by how much it reduces the model's uncertainty about the true answer. Questions whose possible answers would maximally distinguish between remaining possibilities score highest.
Reward propagation: expected rewards are computed across all simulated futures, allowing selection of the question with highest expected information gain — the one that, on average across possible answers, most reduces uncertainty.
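
The scoring step can be sketched in a few lines. This is a simplified illustration, not the UoT implementation: it assumes a known belief over a finite possibility set, assumes each hypothesis determines the user's answer, and uses plain expected entropy reduction as the reward. The names entropy, expected_information_gain, and select_question are hypothetical.

```python
import math
from collections import defaultdict

def entropy(belief):
    """Shannon entropy in bits of a {hypothesis: probability} mapping."""
    return -sum(p * math.log2(p) for p in belief.values() if p > 0)

def expected_information_gain(belief, answer_of):
    """Expected entropy reduction from asking one question.

    answer_of(hypothesis) returns the answer the user would give if that
    hypothesis were true (assumed deterministic in this sketch).
    """
    # Group probability mass by the answer each hypothesis would produce.
    groups = defaultdict(dict)
    for hypothesis, p in belief.items():
        groups[answer_of(hypothesis)][hypothesis] = p

    expected_posterior = 0.0
    for group in groups.values():
        mass = sum(group.values())  # probability of observing this answer
        if mass == 0:
            continue
        posterior = {h: p / mass for h, p in group.items()}
        expected_posterior += mass * entropy(posterior)
    return entropy(belief) - expected_posterior

def select_question(belief, questions):
    """Pick the candidate question with the highest expected gain."""
    return max(questions, key=lambda q: expected_information_gain(belief, questions[q]))

# Toy possibility set: the hidden answer is one of four numbers.
belief = {n: 0.25 for n in (1, 2, 3, 4)}
questions = {
    "Is it 1?": lambda n: n == 1,
    "Is it at most 2?": lambda n: n <= 2,
}
best = select_question(belief, questions)
```

Here the even split ("Is it at most 2?") gains a full bit, while "Is it 1?" gains less because a "no" leaves three possibilities open.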

The medical diagnosis framing makes the mechanism concrete: a patient doesn't report full symptoms. The doctor must decide which question to ask next. A question like "Do you have a fever?" partitions the diagnostic space differently than "Have you traveled recently?" UoT formalizes this: given the current possibility set (diseases consistent with reported symptoms so far), which question's possible answers would most effectively narrow that set?
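
A toy version of the diagnostic case shows how two questions partition the same possibility set differently. The condition profiles below are invented for illustration and carry no medical meaning. The scoring rule, expected number of candidates remaining after the answer, is a simpler stand-in for information gain, not the reward UoT defines.

```python
from collections import defaultdict

# Invented toy profiles for illustration only; not medical knowledge.
CANDIDATES = {
    "condition_a": {"fever": True, "recent_travel": True},
    "condition_b": {"fever": True, "recent_travel": False},
    "condition_c": {"fever": False, "recent_travel": False},
    "condition_d": {"fever": False, "recent_travel": False},
}

def partition(candidates, feature):
    """Group candidate names by the answer they imply for one question."""
    groups = defaultdict(list)
    for name, profile in candidates.items():
        groups[profile[feature]].append(name)
    return dict(groups)

def expected_remaining(candidates, feature):
    """Expected size of the possibility set after the answer arrives,
    assuming every candidate is equally likely beforehand."""
    n = len(candidates)
    groups = partition(candidates, feature).values()
    # A group of size k is reached with probability k/n and leaves k candidates.
    return sum(len(g) ** 2 for g in groups) / n

scores = {f: expected_remaining(CANDIDATES, f) for f in ("fever", "recent_travel")}
next_question = min(scores, key=scores.get)  # fewer expected survivors is better
```

With these made-up profiles the fever question splits the set two and two, while the travel question splits it one and three, so the fever question leaves fewer candidates on average.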

This connects directly to proactive critical thinking. In "Can models learn to ask clarifying questions instead of guessing?", the gap that proactive critical thinking fills is DETECTING incompleteness. UoT fills the complementary gap: once incompleteness is detected, SELECTING the most informative question to ask. In light of "Which clarifying questions actually improve user satisfaction?", UoT also provides the mechanism for generating specific-facet questions rather than generic "can you be more specific?" prompts: the information-gain criterion naturally selects for questions that target the highest-value information asymmetry.

The connection to test-time scaling is architectural: UoT is essentially test-time compute applied to question generation. The simulation-propagation loop trades inference-time computation for better question selection, analogous to how reasoning models trade computation for better answers. Within the dual-process dialogue planning of "Can dialogue planning balance fast responses with strategic depth?", this loop could serve as the System 2 question-selection mechanism: when uncertainty triggers the Monte Carlo Tree Search (MCTS) planner, information-gain scoring provides a principled criterion for which clarifying question to generate next. Relative to "Can tree search replace human feedback in LLM training?", UoT's reward propagation across simulated futures is structurally analogous to MCTS backpropagation: both use tree search to extract quality signals from exploration of future states.
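
The propagation idea can be sketched as a small recursive pass over a tree of simulated futures. This is a generic expectimax-style illustration under stated assumptions, not the exact rule UoT uses: immediate gains and answer probabilities are taken as given, the best follow-up question is assumed at each future state, and the class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class AnswerNode:
    """State reached after a simulated answer; holds candidate follow-ups."""
    questions: List["QuestionNode"] = field(default_factory=list)

@dataclass
class QuestionNode:
    name: str
    gain: float  # immediate information-gain reward (assumed given)
    # (probability of the answer, state that answer leads to)
    outcomes: List[Tuple[float, AnswerNode]] = field(default_factory=list)

def question_value(q: QuestionNode) -> float:
    """Immediate gain plus the probability-weighted value of what follows."""
    return q.gain + sum(p * state_value(s) for p, s in q.outcomes)

def state_value(s: AnswerNode) -> float:
    """Value of a state: the best follow-up question, or 0 at a leaf."""
    return max((question_value(q) for q in s.questions), default=0.0)

def best_question(root: AnswerNode) -> QuestionNode:
    return max(root.questions, key=question_value)

# Illustrative numbers only: q1 looks better greedily, q2 wins with lookahead.
q1 = QuestionNode("q1", 1.0, [(0.5, AnswerNode()), (0.5, AnswerNode())])
q2 = QuestionNode("q2", 0.8, [
    (0.5, AnswerNode([QuestionNode("follow_up_a", 1.0)])),
    (0.5, AnswerNode([QuestionNode("follow_up_b", 0.5)])),
])
root = AnswerNode([q1, q2])
chosen = best_question(root)
```

The extra inference-time cost is the tree itself: deeper simulation can change which question is chosen, which is the trade the paragraph above describes.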

Inquiring lines that read this note 18

This note is a source for these research framings, grouped by the broader line of inquiry each explores.

What makes dialogue-based explanation more successful than monologue?

How do humans decide which level of clarification to request?

How can models identify insufficient information and respond appropriately without guessing?

What makes specific clarifying questions more effective than generic ones?

How should models express uncertainty rather than forced confident answers?

Why do readers trust citations and complexity regardless of accuracy?

How do experts decide which information matters for a specific audience?

Does reinforcement learning teach reasoning or just when to reason?

Can reinforcement learning teach AI when to ask clarifying questions?

What makes weaker teacher models effective for stronger student training?

Can we cheaply estimate which samples are currently most informative?

Can model confidence signals reliably improve reasoning quality and calibration?

Can imperfect uncertainty estimates still beat uniform oversight strategies?

Can alternative training methods improve on supervised fine-tuning for language models?

Can information-gain principles improve how we choose what to label?

How does AI adoption affect human skill development and labor equality?

How should forecasting methods adapt to a post-AGI regime?

Related concepts in this collection 6

Can models learn to ask clarifying questions instead of guessing? Exploring whether large language models can be trained to detect incomplete queries and actively request missing information rather than hallucinating answers or refusing to respond. This matters because conversational agents today remain passive, responding only when prompted.
UoT provides the selection mechanism that proactive critical thinking needs: once missing information is detected, which question recovers it fastest
Which clarifying questions actually improve user satisfaction? Not all clarification helps equally. This explores whether asking users to rephrase their needs works as well as asking targeted questions about specific information gaps.
information-gain criterion naturally selects specific-facet questions over generic rephrasing
Can AI agents communicate efficiently in joint decision problems? When humans and AI must collaborate to solve optimization problems under asymmetric information, what communication patterns enable effective coordination? Current LLMs struggle with this—why?
UoT operationalizes the asymmetric information problem: simulate what the user might know, ask what most reduces the asymmetry
When should AI agents ask users instead of just searching? Explores whether tool-enabled LLMs should probe users for clarification when uncertain, rather than silently chaining tool calls that drift from intent. Examines conversation analysis patterns as a formal alternative.
UoT provides the selection mechanism for which insert-expansion to use
Can dialogue planning balance fast responses with strategic depth? Can a system use quick instinctive responses for familiar conversation contexts while activating deeper planning only when uncertainty demands it? This explores whether adaptive computation improves dialogue goal-reaching.
UoT's simulation loop could serve as the System 2 question-selection mechanism when uncertainty triggers MCTS planning
Can tree search replace human feedback in LLM training? Explores whether Monte Carlo Tree Search can generate quality signals for self-improvement without expensive human annotations. Matters because annotation bottlenecks currently limit LLM scaling.
structural analogy: UoT's reward propagation across simulated futures parallels MCTS backpropagation of quality signals

Original note title

uncertainty-aware question selection via information gain simulates possible futures to determine the optimal next question to ask
