INQUIRING LINE

An AI should ask whichever question teaches it the most — not the one that feels most natural or polite.

How does asymmetric information shape what to ask users first?

This explores how the gap between what the system already knows and what only the user knows determines which clarifying question is most worth asking first.

This explores how information asymmetry (the gap between what the system already knows and what only the user knows) should drive the first question an AI asks. The corpus has a surprisingly coherent answer: ask the question whose possible answers would most reduce your uncertainty, not the one that sounds most polite or generic. The cleanest formalization is information-gain selection, where a model simulates the range of answers a candidate question could produce and picks the question that collapses the most uncertainty about what the user wants [How can models select the most informative question to ask?]. The same logic shows up in preference learning, where roughly ten adaptively chosen questions are enough to pin down a user's personalized reward coefficients, because each question is selected to target the dimension the system is currently least sure about [Can user preferences be learned from just ten questions?].
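As a rough illustration of information-gain selection, the sketch below scores each candidate question by the expected drop in entropy over a small set of hypotheses about what the user wants. It is not the method of any cited paper: the hypothesis set, the uniform prior, the answer likelihoods, and the function names are all assumptions made up for this example.

```python
import math


def entropy(probs):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_information_gain(prior, likelihood):
    """Expected entropy reduction over hypotheses from asking one question.

    prior:      dict hypothesis -> P(hypothesis)
    likelihood: dict hypothesis -> dict answer -> P(answer | hypothesis)
    """
    answers = {a for dist in likelihood.values() for a in dist}
    expected_posterior_entropy = 0.0
    for a in answers:
        # Marginal probability of hearing this answer.
        p_a = sum(prior[h] * likelihood[h].get(a, 0.0) for h in prior)
        if p_a == 0:
            continue
        # Bayes update of the belief over hypotheses given this answer.
        posterior = [prior[h] * likelihood[h].get(a, 0.0) / p_a for h in prior]
        expected_posterior_entropy += p_a * entropy(posterior)
    return entropy(prior.values()) - expected_posterior_entropy


def pick_question(prior, questions):
    """Return the candidate question with the highest expected gain."""
    return max(questions, key=lambda q: expected_information_gain(prior, questions[q]))


# Hypothetical example: four equally likely intents behind "I need a monitor".
prior = {"gaming": 0.25, "movies": 0.25, "office": 0.25, "coding": 0.25}
questions = {
    # Splits the hypotheses in half.
    "Is it for work or play?": {
        "gaming": {"play": 1.0}, "movies": {"play": 1.0},
        "office": {"work": 1.0}, "coding": {"work": 1.0},
    },
    # Isolates only one hypothesis.
    "Do you need a high refresh rate?": {
        "gaming": {"yes": 1.0}, "movies": {"no": 1.0},
        "office": {"no": 1.0}, "coding": {"no": 1.0},
    },
    # The answer is unrelated to intent, so it teaches nothing.
    "Do you want a curved screen?": {
        h: {"yes": 0.5, "no": 0.5} for h in ("gaming", "movies", "office", "coding")
    },
}

best = pick_question(prior, questions)
```

In this toy setup the even split wins, because a question whose answers divide the remaining hypotheses evenly removes more uncertainty than one that only confirms or rules out a single case.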

But 'maximum information gain' isn't the whole story, because not all informative questions feel useful to the person answering. The corpus is blunt here: questions that target a concrete, foreseeable gap ('What type of monitor?') consistently beat questions that throw the burden back on the user ('What are you trying to do?') [Which clarifying questions actually improve user satisfaction?]. Users engage when they can see how their answer changes the result. So the asymmetry that matters isn't only the system's uncertainty; it's also the user's ability to cheaply resolve it. The best first question sits where high information gain for the system overlaps with low answering cost for the user. ALFA pushes this further by breaking 'good question' into trainable attributes like specificity, relevance, and clarity, showing that optimizing those facets separately produces better clarification than chasing a single quality score, especially in clinical settings where the wrong first question changes a diagnosis [Can models learn to ask genuinely useful clarifying questions?].
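One way to picture the overlap between high information gain and low answering cost is a simple penalized score. The sketch below is an assumption-laden illustration, not something taken from the cited work: the `Candidate` fields, the linear `gain - weight * cost` form, and every number are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    info_gain_bits: float  # assumed output of some upstream gain estimator
    answer_cost: float     # hypothetical 0..1 estimate of user effort


def rank_questions(candidates, cost_weight=1.0):
    """Order candidates by information gain penalized by answering cost."""
    return sorted(
        candidates,
        key=lambda c: c.info_gain_bits - cost_weight * c.answer_cost,
        reverse=True,
    )


# Illustrative numbers only: the open-ended probe is assumed to be slightly
# more informative but far more effortful to answer.
candidates = [
    Candidate("What are you trying to do?", info_gain_bits=1.5, answer_cost=0.9),
    Candidate("What type of monitor?", info_gain_bits=1.2, answer_cost=0.1),
]

gain_only = rank_questions(candidates, cost_weight=0.0)
cost_aware = rank_questions(candidates, cost_weight=1.0)
```

With the cost term switched off, the open-ended probe ranks first; once user effort is priced in, the concrete question overtakes it. How to estimate `answer_cost` in practice is left open here.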

There's a prior question hiding underneath: when should the system ask at all instead of just acting? Tool-using agents tend to chain silent searches and drift from intent; conversation analysis offers 'insert-expansions' as a formal trigger for probing the user the moment intent is ambiguous, rather than recovering from a misread later [When should AI agents ask users instead of just searching?]. Asymmetry is the signal for that decision too: if the private information the user holds is decision-critical and unrecoverable by search, that's exactly when to interrupt and ask.
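The ask-or-act rule can be sketched with a standard decision-theoretic quantity, the expected value of perfect information (EVPI): how much better the system could act if it knew the user's intent for certain. Using EVPI as a stand-in for 'decision-critical' is this example's assumption, not a claim from the cited notes, and the intents, actions, payoffs, and `ask_cost` threshold are all hypothetical.

```python
def expected_value_of_perfect_information(belief, utility):
    """EVPI for a belief over intents and a payoff table.

    belief:  dict intent -> probability
    utility: dict action -> dict intent -> payoff
    """
    # Best the system can do by committing to one action right now.
    best_without_asking = max(
        sum(belief[i] * utility[a][i] for i in belief) for a in utility
    )
    # Best it could do if the user first revealed the true intent.
    best_with_answer = sum(
        belief[i] * max(utility[a][i] for a in utility) for i in belief
    )
    return best_with_answer - best_without_asking


def should_ask(belief, utility, ask_cost, recoverable_by_search):
    """Interrupt the user only when search cannot help and the answer matters."""
    if recoverable_by_search:
        return False
    return expected_value_of_perfect_information(belief, utility) > ask_cost


# Hypothetical support scenario with two intents and two actions.
utility = {
    "start_refund": {"refund": 1.0, "exchange": 0.0},
    "start_exchange": {"refund": 0.0, "exchange": 1.0},
}
ambiguous = {"refund": 0.5, "exchange": 0.5}
confident = {"refund": 0.95, "exchange": 0.05}

ask_when_ambiguous = should_ask(ambiguous, utility, ask_cost=0.1, recoverable_by_search=False)
ask_when_confident = should_ask(confident, utility, ask_cost=0.1, recoverable_by_search=False)
ask_when_searchable = should_ask(ambiguous, utility, ask_cost=0.1, recoverable_by_search=True)
```

Under these made-up payoffs, the system asks only in the ambiguous case where search cannot recover the intent; when it is already confident, or when a tool call could settle the question, it proceeds without interrupting.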

The failure cases show most clearly why asymmetry is structurally central. When one model secretly controls all parties in a social simulation, it looks socially competent, but that competence evaporates the moment agents hold genuinely private information, because the model was skipping the grounding work that real asymmetry forces [Why do LLMs fail when simulating agents with private information?]. The mirror image is pedagogical: a teacher can only correct a student because the teacher has access to correct answers or verifier output that the student lacks; remove that gap and no learning signal exists [Why does teacher-student information asymmetry enable learning signals?]. Asking a user a question is the same move in reverse: the user is the one holding the privileged information, and the question is how the system extracts the gradient it can't generate alone.

Two cautions round out the picture. Adaptively learning a single user's preferences can curdle into sycophancy and echo chambers once the averaging effect of aggregate models is gone, so the questions you ask to personalize can quietly optimize for agreement rather than truth [Does personalizing reward models amplify user echo chambers?]. And the channel itself is leaky: identical questions get measurably different answers depending on the emotional tone of the framing, which means how you ask shapes the information you get back, not just what you ask [Does emotional tone in prompts change what information LLMs provide?]. The takeaway a curious reader might not expect: the first question isn't a courtesy or a search fallback. It's the system deliberately locating the one place where the user knows something it can't, and where the user can tell it cheaply.

Sources 9 notes

How can models select the most informative question to ask?

UoT combines uncertainty-aware scenario simulation with information-gain scoring and reward propagation to identify questions whose possible answers maximally reduce diagnostic uncertainty—providing a principled mechanism for specific, high-value clarification rather than generic prompts.

Can user preferences be learned from just ten questions?

PReF learns base reward functions from preference data, then uses active learning to select maximally informative questions that reduce coefficient uncertainty. Users can be personalized via inference-time reward alignment without weight modification.

Which clarifying questions actually improve user satisfaction?

Clarifying questions that target concrete information gaps ("What type of monitor?") consistently beat those that ask users to rephrase their needs ("What are you trying to do?"). Users engage most when they can foresee how answering improves results.

Can models learn to ask genuinely useful clarifying questions?

The ALFA framework breaks down question quality into theory-grounded attributes (clarity, relevance, specificity) and trains models on 80K attribute-specific preference pairs. Attribute-specific optimization outperforms single-score training, especially in clinical reasoning where asking the right clarifying question directly impacts decision quality.

When should AI agents ask users instead of just searching?

Tool-enabled LLMs drift from user intent through silent tool chaining. Conversation analysis reveals insert-expansions—clarifying intent, scoping responses, enhancing appeal—as a formal framework for proactive user consultation that prevents misunderstanding instead of recovering from it.

Why do LLMs fail when simulating agents with private information?

Research shows LLMs perform well when one model controls all interlocutors but fail systematically when agents possess private information. This reveals that apparent social competence relies on grounding work that models skip in omniscient settings.

Why does teacher-student information asymmetry enable learning signals?

Social meta-learning requires information asymmetry—the teacher's access to correct answers or verifier output—to generate meaningful corrective signals. Without this asymmetry, teacher and student share identical uncertainty, making pedagogical correction impossible.

Does personalizing reward models amplify user echo chambers?

Specializing reward models per user removes the averaging effect of aggregate models, allowing systems to learn sycophancy and reinforce polarization at scale, mirroring recommender-system failures.

Does emotional tone in prompts change what information LLMs provide?

GPT-4 exhibits emotional rebound (negative prompts yield ~86% neutral-positive responses) and a tone floor (positive prompts rarely go negative), causing identical questions to receive different answers depending on emotional framing. This bias is suppressed only on sensitive topics where alignment constraints override tone effects.

Papers this line draws on 8

The research behind the notes this line reads — ranked by how closely each paper relates.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a research analyst re-testing whether information-asymmetry-driven question selection still holds as a design principle for LLM-user interaction. The question: **When and how should an AI choose its opening question to a user?**

What a curated library found — and when (dated claims, not current truth):
Findings span 2023–2026, but treat all as perishable:
• Information-gain maximization (choosing questions that collapse the most model uncertainty) works in principle, but real users prefer *concrete, foreseeable* questions over open-ended ones; satisfaction drops measurably on vague probes (2024-02).
• ~10 adaptively chosen questions suffice to factor personalized user preferences into linear reward coefficients; each question targets the dimension the system is least certain about (2025-03).
• Question quality decomposes into trainable facets (specificity, relevance, clarity) rather than a single metric; multi-objective optimization outperforms single-score chasing in clinical diagnosis (2025-02).
• Models simulate social reasoning well only when they control all parties; competence *collapses* under genuine information asymmetry, because the model skips the grounding work that private information requires (2024-03).
• Emotional framing (tone, affect) measurably shifts answers to *identical* questions; the channel itself biases the information returned (2025-06).

Anchor papers (verify; mind their dates):
• arXiv:2402.01934 — Clarifying the Path to User Satisfaction (2024-02)
• arXiv:2403.05020 — Simulating Social Interaction Under Asymmetry (2024-03)
• arXiv:2502.14860 — ALFA: Asking Good Questions in Clinical Reasoning (2025-02)
• arXiv:2503.06358 — Reward Factorization for Personalization (2025-03)

Your task:
(1) **Re-test each constraint.** For concrete/foreseeable questions vs. open-ended: do newer retrieval-augmented, agentic, or multi-turn architectures now make vague probes *work*? Has improved user modeling via memory or thought-tracing systems (e.g., 2026-02) changed the trade-off? Separate the durable insight (asymmetry is decision-critical) from the perishable claim (specificity always wins). Cite what changed it.
(2) **Surface the strongest contradicting or superseding work** from the last ~6 months. Look especially for papers on exploratory questioning, Socratic dialogue, or adversarial/high-stakes question design that *reject* information-gain framing.
(3) **Propose 2 research questions that assume the regime moved:**
   • If agentic systems can now silently gather context via tool chains before asking, does *when* to ask remain as asymmetry-driven, or does it shift to resource cost?
   • If emotional/stylistic framing now systematically biases answers, should the first question include *meta-directives about truthfulness*?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
