SYNTHESIS NOTE

How can models select the most informative question to ask?

Explores whether simulating possible futures and scoring questions by information gain can identify which clarifying question would best reduce uncertainty—moving beyond just deciding whether to ask toward deciding what to ask.

Synthesis note · 2026-02-22 · sourced from Question Answer Search

Most work on clarifying questions addresses WHETHER to ask. Uncertainty of Thoughts (UoT) addresses WHAT to ask, and provides a principled, information-theoretic mechanism for selecting the question with the highest expected information gain.

The algorithm has three components working together:

  1. Uncertainty-aware simulation: the model generates multiple candidate questions, then simulates possible future scenarios for each — what might the user answer, and what would each answer imply? These simulations form a tree structure of possible futures.

  2. Information-gain rewards: each simulated path is scored by how much it reduces the model's uncertainty about the true answer. Questions whose possible answers would maximally distinguish between remaining possibilities score highest.

  3. Reward propagation: expected rewards are computed across all simulated futures, allowing selection of the question with highest expected information gain — the one that, on average across possible answers, most reduces uncertainty.
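The three components above can be sketched in a few lines. This is a minimal illustration, not the UoT implementation: it assumes the model's uncertainty is an explicit probability table over hypotheses (`prior`) and that the simulated answers are summarized as `P(answer | hypothesis)` tables. The function names and both data structures are hypothetical choices made for the example.

```python
import math

def entropy(dist):
    """Shannon entropy in bits of a {outcome: probability} table."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def expected_information_gain(prior, likelihood):
    """Expected entropy reduction from asking one question.

    prior:      {hypothesis: P(hypothesis)}            (assumed representation)
    likelihood: {hypothesis: {answer: P(answer | hypothesis)}}
    """
    # Simulation: enumerate every answer the user might give.
    answers = {a for h in prior for a in likelihood[h]}
    expected_posterior_entropy = 0.0
    for a in answers:
        joint = {h: prior[h] * likelihood[h].get(a, 0.0) for h in prior}
        p_answer = sum(joint.values())
        if p_answer == 0:
            continue
        # Bayesian update: what the model would believe after hearing this answer.
        posterior = {h: j / p_answer for h, j in joint.items()}
        # Propagation: weight each simulated future by its probability.
        expected_posterior_entropy += p_answer * entropy(posterior)
    return entropy(prior) - expected_posterior_entropy

def select_question(prior, candidate_questions):
    """Return the candidate with the highest expected information gain."""
    return max(
        candidate_questions,
        key=lambda q: expected_information_gain(prior, candidate_questions[q]),
    )
```

A question whose answer is the same regardless of the hypothesis scores zero under this criterion, while one that splits the probability mass evenly scores highest, which matches the intuition in step 2.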

The medical diagnosis framing makes the mechanism concrete: a patient doesn't report full symptoms. The doctor must decide which question to ask next. A question like "Do you have a fever?" partitions the diagnostic space differently than "Have you traveled recently?" UoT formalizes this: given the current possibility set (diseases consistent with reported symptoms so far), which question's possible answers would most effectively narrow that set?
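A small worked calculation shows how the two example questions can differ. The numbers are illustrative assumptions, not clinical facts: suppose four diagnoses remain and are equally likely, fever is expected in two of them, and recent travel is relevant to only one. With a uniform possibility set of size n, uncertainty is log2(n) bits, and the information gain of a question is the current uncertainty minus the expected uncertainty after the answer.

```latex
\begin{align*}
H_{\text{before}} &= \log_2 4 = 2 \text{ bits} \\
IG(\text{fever}) &= 2 - \left(\tfrac{2}{4}\log_2 2 + \tfrac{2}{4}\log_2 2\right) = 1 \text{ bit} \\
IG(\text{travel}) &= 2 - \left(\tfrac{1}{4}\log_2 1 + \tfrac{3}{4}\log_2 3\right) \approx 0.81 \text{ bits}
\end{align*}
```

Under these assumed numbers the fever question is the better next question, because it splits the remaining diagnoses evenly whichever answer comes back.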

This connects directly to proactive critical thinking. As the note "Can models learn to ask clarifying questions instead of guessing?" describes, the gap that proactive critical thinking fills is DETECTING incompleteness. UoT fills the complementary gap: once incompleteness is detected, SELECTING the most informative question to ask. And in light of "Which clarifying questions actually improve user satisfaction?", UoT provides a mechanism for generating specific-facet questions rather than generic "can you be more specific?" prompts, because the information-gain criterion favors questions that target the highest-value gap between what the user knows and what the model knows.

The connection to test-time scaling is architectural: UoT is essentially test-time compute applied to question generation. The simulation-propagation loop trades inference-time computation for better question selection, analogous to how reasoning models trade computation for better answers. In the dual-process framing of "Can dialogue planning balance fast responses with strategic depth?", UoT's simulation-propagation loop could serve as the System 2 question-selection mechanism: when uncertainty triggers the Monte Carlo Tree Search (MCTS) planner, information-gain scoring provides a principled criterion for which clarifying question to generate next. And relative to "Can tree search replace human feedback in LLM training?", UoT's reward propagation across simulated futures is structurally analogous to MCTS backpropagation: both use tree search to extract quality signals from exploring future states.
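The compute-for-quality trade can be made concrete with a depth-limited lookahead, where `depth` acts as the inference-time budget. The sketch below is a simplification under stated assumptions: a uniform possibility set, deterministic answers, and a reward that simply adds the expected best follow-up gain to the immediate gain. It is not the reward definition from the UoT paper, and all function names are hypothetical.

```python
import math
from collections import defaultdict

def split(possibilities, answer_fn):
    """Group possibilities by the answer each one would produce."""
    groups = defaultdict(list)
    for p in possibilities:
        groups[answer_fn(p)].append(p)
    return list(groups.values())

def immediate_gain(possibilities, groups):
    """Entropy reduction in bits, assuming all possibilities are equally likely."""
    total = len(possibilities)
    remaining = sum(len(g) / total * math.log2(len(g)) for g in groups)
    return math.log2(total) - remaining

def propagated_value(possibilities, questions, answer_fn, depth):
    """Immediate gain plus the expected value of the best follow-up question."""
    groups = split(possibilities, answer_fn)
    value = immediate_gain(possibilities, groups)
    if depth > 1:
        for g in groups:
            if len(g) > 1:  # a single remaining possibility needs no follow-up
                best_next = max(
                    propagated_value(g, questions, q, depth - 1)
                    for q in questions.values()
                )
                # Propagate upward, weighted by how likely this branch is.
                value += len(g) / len(possibilities) * best_next
    return value

def select_question_with_lookahead(possibilities, questions, depth):
    """Pick the question with the highest propagated value at the given depth."""
    return max(
        questions,
        key=lambda name: propagated_value(possibilities, questions, questions[name], depth),
    )

# Toy usage: eight possibilities and three yes/no questions.
possibilities = list(range(8))
questions = {
    "x < 4": lambda x: x < 4,
    "x in {0,1,2}": lambda x: x in {0, 1, 2},
    "x in {1,2,3}": lambda x: x in {1, 2, 3},
}
greedy_choice = select_question_with_lookahead(possibilities, questions, depth=1)
lookahead_choice = select_question_with_lookahead(possibilities, questions, depth=2)
```

In this toy setup the even split "x < 4" wins at depth 1, but at depth 2 it loses to either of the other two questions, because they leave room for a more useful follow-up. That is the sense in which extra simulation depth can buy a better question, at a cost that grows with the number of candidates raised to the depth.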

Original note title

uncertainty-aware question selection via information gain simulates possible futures to determine the optimal next question to ask