Conversational Recommenders

Why do queries and their causes seem semantically different?

Information retrieval systems find passages matching query language, but what if the segment that actually caused a user's question says something quite different? This explores when semantic similarity fails to find causal relevance.

What makes conversational recommenders hard to build well?

Most assume the challenge is language fluency, but what if the real problem is managing mixed-initiative dialogue—where both users and systems take turns driving the conversation?

Can language models bridge the gap between critique and preference?

When users express what they dislike rather than what they want, can LLMs reliably transform those critiques into positive preferences that retrieval systems can actually use?

Does conversation order matter for recommending items in dialogue?

Conversational recommendation systems typically ignore the sequence in which items are mentioned, treating dialogue as a bag of entities. But does the order itself carry predictive signal about what to recommend next?

Do simulated training interactions transfer to real conversations?

Most conversational recommender systems train on simulated entity-level exchanges, not natural dialogue. The question is whether models built this way actually work when deployed with real users who speak naturally and deviate from expected patterns.

Can unified policy learning improve conversational recommender systems?

This explores whether formulating attribute-asking, item-recommending, and timing decisions as a single reinforcement learning policy outperforms treating them as separate components. The question matters because joint optimization could improve conversation quality and system scalability.

Can conversational recommenders recover lost preference signals from history?

Conversational recommenders abandoned item and user similarity signals when they shifted to dialogue-focused design. Can integrating historical sessions and look-alike users restore these channels without losing dialogue benefits?

How should LLM-based recommenders retrieve from massive item corpora?

When conversational recommenders need to search millions of items, the LLM cannot memorize the corpus. What retrieval strategies work best under different constraints, and how do they trade off latency, sample efficiency, and scalability?

Where does LLM recommendation bias actually come from?

Do conversational AI systems inherit popularity bias from their training data or from the datasets they're deployed on? Understanding the source matters for knowing how to fix it.

How can LLM agents handle huge candidate lists without breaking?

ReAct agents fail when retrieval tools return hundreds of items that overflow prompts. What architectural changes let LLMs work effectively with large candidate sets in recommendation systems?

Can controlled latent variables make LLM user simulators realistic?

Can session-level and turn-level latent variables steer LLM-based user simulators toward realistic dialogue while maintaining measurable diversity and ground truth labels for training conversational systems?

Do LLMs in conversational recommendation systems use collaborative or content knowledge?

Conversational recommenders powered by LLMs might rely on either collaborative signals (user interaction patterns) or content/context knowledge (semantic understanding). Understanding which signal dominates would reveal how to design and deploy these systems effectively.