INQUIRING LINE

Could a handful of smart follow-up questions help you discover preferences you couldn't have stated on your own?

How can insert-expansion techniques help users discover their own preferences?

This reads 'insert-expansion' as the conversational move where a system pauses to ask a clarifying sub-question before responding — so the question is really: can a system asking the right follow-ups help people figure out what they actually want, rather than just record what they already said?

This explores whether well-placed clarifying questions — the back-and-forth that interrupts a request to pin down what's really meant — can help users surface preferences they couldn't state up front. The corpus is encouraging here, because several lines of work treat preference not as something the user hands over fully formed, but as something elicited, inferred, or even discovered through interaction.

The most direct support comes from active questioning. One approach learns a small set of base 'reward' dimensions from many users, then asks each new user only the questions that most reduce uncertainty about where they sit on those dimensions; roughly ten adaptive questions are enough to personalize, with no retraining (see 'Can user preferences be learned from just ten questions?'). That's exactly the insert-expansion spirit: a short, targeted interrogation that resolves ambiguity. A complementary trick is translating what a user says into what they mean, turning a vague complaint like 'doesn't look good for a date' into a usable positive preference ('prefer more romantic') (see 'Can language models bridge the gap between critique and preference?'). Both treat the user's first utterance as an opening, not an answer.
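To make 'asking only the questions that most reduce uncertainty' concrete, here is a minimal sketch of greedy question selection by expected entropy. It is not the method from the note: the discrete set of candidate user profiles, the noiseless answer model `prefers_a`, and the names `HYPOTHESES`, `QUESTIONS`, and `elicit` are all assumptions made for illustration.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def prefers_a(weights, question):
    """Hypothetical answer model: pick option A when its weighted score is at least B's."""
    option_a, option_b = question
    score_a = sum(w * x for w, x in zip(weights, option_a))
    score_b = sum(w * x for w, x in zip(weights, option_b))
    return score_a >= score_b

def expected_entropy(belief, hypotheses, question):
    """Expected remaining uncertainty about the user after hearing the answer."""
    total = 0.0
    for ans in (True, False):
        masses = [p for p, h in zip(belief, hypotheses) if prefers_a(h, question) == ans]
        mass = sum(masses)
        if mass > 0:
            total += mass * entropy([p / mass for p in masses])
    return total

def update(belief, hypotheses, question, ans):
    """Bayes update with a noiseless answer: zero out profiles that disagree."""
    kept = [p if prefers_a(h, question) == ans else 0.0 for p, h in zip(belief, hypotheses)]
    norm = sum(kept)
    return [p / norm for p in kept]

def elicit(true_weights, hypotheses, questions, budget=10):
    """Ask the most informative question until the belief is certain or the budget runs out.

    Assumes true_weights is one of the hypotheses, so the update never divides by zero.
    """
    belief = [1.0 / len(hypotheses)] * len(hypotheses)
    asked = 0
    while asked < budget and entropy(belief) > 1e-9:
        q = min(questions, key=lambda c: expected_entropy(belief, hypotheses, c))
        belief = update(belief, hypotheses, q, prefers_a(true_weights, q))
        asked += 1
    return belief, asked

# Illustrative profiles: weights over (romantic, budget-friendly).
HYPOTHESES = [(1.0, 0.0), (0.0, 1.0), (0.7, 0.3), (0.3, 0.7)]
# Each question offers two items described on the same two dimensions.
QUESTIONS = [
    ((1.0, 0.0), (0.0, 1.0)),
    ((1.0, 0.0), (0.8, 0.8)),
    ((0.0, 1.0), (0.8, 0.8)),
]
belief, asked = elicit(HYPOTHESES[2], HYPOTHESES, QUESTIONS)
best = belief.index(max(belief))
print(best, asked)
```

The design choice worth noticing is that the system never asks a question whose answer it can already predict: such a question leaves the expected entropy unchanged, so it loses to any question that splits the remaining profiles.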

But the corpus also reveals the limits of asking, and that's the part a curious reader might not expect. Some systems learn preferences by watching instead of interrogating: they build an entity-centric memory of a person across observations and act on inferred taste without ever posing a question (see 'Can agents learn preferences by watching rather than asking?'). Others mine activity logs and find that two-thirds of users are pursuing 'interest journeys' lasting over a month, concrete pursuits like 'designing hydroponic systems for small spaces' that the user might never articulate in a clarifying exchange (see 'Can language models discover what users actually want from activity logs?'). The discovery there happens behind the user's back and can then be named for them.

There's also a quieter finding about what to do with the answers once you have them: abstract summaries of preference tend to beat replaying specific past interactions (see 'Does abstract preference knowledge outperform specific interaction recall?'), which suggests the payoff of a good clarifying exchange isn't the transcript but the distilled belief it leaves behind. And because preferences are plural (people hold multiple personas, weighted differently depending on context; see 'Can attention mechanisms reveal which user taste explains each recommendation?'), the questions that help a user discover themselves may be ones that reveal which version of them is showing up right now.
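The idea of personas 'weighted differently depending on context' can be sketched as a softmax over persona-item affinities. This is a toy illustration, not the model from the note: the two-dimensional (kitchen, outdoors) vectors, the persona names, and the functions `persona_weights` and `explain` are assumptions.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def persona_weights(personas, item):
    """Softmax over persona-item affinities: one attention weight per persona."""
    names = list(personas)
    scores = [dot(personas[n], item) for n in names]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return {n: e / total for n, e in zip(names, exps)}

def explain(personas, item):
    """Return the attention-weighted score and the persona that explains it best."""
    weights = persona_weights(personas, item)
    score = sum(weights[n] * dot(personas[n], item) for n in personas)
    dominant = max(weights, key=weights.get)
    return score, dominant

# Hypothetical user with two tastes, embedded on (kitchen, outdoors).
PERSONAS = {"home cook": (1.0, 0.0), "weekend hiker": (0.0, 1.0)}
score_boots, why_boots = explain(PERSONAS, (0.1, 0.9))  # an outdoors-heavy item
score_pan, why_pan = explain(PERSONAS, (0.9, 0.1))      # a kitchen-heavy item
print(why_boots, why_pan)
```

Because the weights depend on the candidate item, the same user is described by a different mix of personas for each recommendation, which is what lets each suggestion be traced to one taste.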

The thing you didn't know you wanted to know: discovery doesn't only come from asking better questions. Exploration helps users meet preferences they didn't know they had. Efficient bandit methods deliberately try uncertain options to learn faster (see 'Can neural networks explore efficiently at recommendation scale?'), and friends with *different* tastes outperform similar ones precisely because they push users toward anomalous, off-pattern choices (see 'Can friends with different tastes improve recommendations?'). So the fullest answer to insert-expansion is a pair: clarify what's ambiguous, but also nudge into the unknown. Self-discovery lives at both ends.
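For readers unfamiliar with how a bandit 'deliberately tries uncertain options', here is a minimal Beta-Bernoulli Thompson sampling sketch. It is the textbook version, not the neural method from the note, and the click rates, round count, and function name `thompson_sampling` are assumptions chosen for illustration.

```python
import random

def thompson_sampling(click_rates, rounds=2000, seed=0):
    """Simulate Thompson sampling over items with hidden click rates; return pulls per item."""
    rng = random.Random(seed)
    wins = [1] * len(click_rates)    # Beta(1, 1) prior: alpha counts
    losses = [1] * len(click_rates)  # Beta(1, 1) prior: beta counts
    pulls = [0] * len(click_rates)
    for _ in range(rounds):
        # Draw one plausible click rate per item from its posterior.
        samples = [rng.betavariate(a, b) for a, b in zip(wins, losses)]
        # Show the item whose sampled rate is highest; uncertain items win sometimes.
        arm = samples.index(max(samples))
        pulls[arm] += 1
        # Simulated user feedback (an assumption standing in for real clicks).
        if rng.random() < click_rates[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return pulls

pulls = thompson_sampling([0.1, 0.2, 0.6])
print(pulls)
```

Items with few observations have wide posteriors, so they occasionally produce the highest sample and get shown. As evidence accumulates the posteriors narrow and the best item is shown most often.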

Sources 8 notes

Can user preferences be learned from just ten questions?

PReF learns base reward functions from preference data, then uses active learning to select the most informative questions, those that most reduce uncertainty about a user's coefficients. Outputs can then be personalized through inference-time reward alignment, without modifying model weights.

Can language models bridge the gap between critique and preference?

Few-shot LLM prompting can convert natural negative feedback like "doesn't look good for a date" into positive preferences like "prefer more romantic," enabling retrieval systems to find better-matching recommendations without fine-tuning.

Can agents learn preferences by watching rather than asking?

M3-Agent demonstrates that separating episodic events from semantic knowledge in an entity-centric graph, combined with parallel memorization and control processes, allows agents to infer and act on user preferences without asking. This architecture mirrors human cognitive systems that bind disparate information about individuals across sensory modalities.

Can language models discover what users actually want from activity logs?

66% of users pursue valued interest journeys lasting over a month, described in specific phrases like 'designing hydroponic systems for small spaces.' LLM-powered journey discovery bridges the semantic gap that collaborative filtering cannot reach, operating at user-level granularity with persona-level precision.

Does abstract preference knowledge outperform specific interaction recall?

PRIME framework shows semantic memory (preference summaries, parametric encodings) consistently beats episodic memory (retrieved past interactions) across models. Recency-based recall outperforms similarity-based retrieval, and task fine-tuning exceeds preference tuning methods.

Can attention mechanisms reveal which user taste explains each recommendation?

AMP-CF represents each user as multiple latent personas weighted dynamically by candidate item. This makes recommendations both diverse and interpretable—each suggestion traces to the specific persona preference it satisfies—without requiring post-hoc reranking.

Can neural networks explore efficiently at recommendation scale?

ENR separates aleatoric from epistemic uncertainty, focusing computation only on parameter uncertainty needed for Thompson sampling. It improved click-through rates 9% and ratings 6% while requiring 29% fewer interactions than baselines.

Can friends with different tastes improve recommendations?

Social Poisson Factorization uses friends' diverse tastes to recommend items outside users' usual preferences, outperforming methods that pull friends' representations together. Networks add value through influence on anomalous choices, not taste similarity.

Papers this line draws on 8

The research behind the notes this line reads — ranked by how closely each paper relates.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a recommendation systems researcher assessing whether insert-expansion (clarifying back-and-forth) remains the primary lever for user preference discovery, or whether the regime has shifted. The question: *Can targeted questioning still compete with passive observation + LLM synthesis for surfacing latent user preferences?*

What a curated library found — and when (findings span 2017–2025; treat as dated claims):
• Active questioning: ~10 adaptive questions suffice to personalize, learning reward dimensions without retraining (2025).
• Critique-to-preference: Vague user complaints ('doesn't look good for a date') translate into usable preferences via LLM bridging (2021).
• Passive discovery outpaces asking: Two-thirds of users pursue month-long 'interest journeys' that emerge from activity logs, never articulated in dialogue (2023).
• Semantic abstraction beats episodic replay: Distilled preference summaries outperform replaying past interactions (2024).
• Exploration + homophily inversion: Deliberately trying uncertain options and receiving recommendations from friends with *different* tastes accelerates self-discovery (2023).

Anchor papers (verify; mind their dates):
• arXiv:2503.06358 — Reward Factorization (2025)
• arXiv:2305.15498 — LLM Interest Journeys (2023)
• arXiv:2109.07576 — Critique-to-Preference (2021)
• arXiv:2306.14834 — Neural Contextual Bandits (2023)

Your task:
(1) RE-TEST EACH CONSTRAINT: For active questioning, judge whether LLM-in-the-loop personalization (e.g., Rec-R1, PRIME) has *compressed* the question budget below 10, or shifted the bottleneck to preference *synthesis* rather than *elicitation*. For passive discovery, assess whether newer models now surface latent interests *faster* than dialogue and whether users prefer implicit over explicit preference-sharing. Separate the durable question—does dialogue still uniquely reveal preferences users didn't know they held?—from the perishable claim that asking is the bottleneck.
(2) Surface the strongest CONTRADICTING or SUPERSEDING work from the last ~6 months: search for papers on agentic preference inference, retrieval-augmented personalization, or multi-turn dialogue that claim to *obsolete* questionnaire-based elicitation.
(3) Propose 2 research questions assuming the regime has moved:
   • Can LLM-driven *observational synthesis* + one targeted clarification outperform multi-turn questioning on latency *and* accuracy?
   • Do users *discover* preferences faster through serendipitous exploration (bandit + heterophily) than through dialogue, and can dialogue + exploration be sequenced optimally?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
