Can language models bridge the gap between critique and preference?
When users express what they dislike rather than what they want, can LLMs reliably transform those critiques into positive preferences that retrieval systems can actually use?
In conversational recommendation, people often state preferences as critiques of the current candidate rather than as positive descriptions of what they want. "It doesn't look good for a date" tells you what the user wants (a date-suitable place), but expresses it as the negation of a property of the current option. Conventional retrieval systems can't act on critiques directly because their indexes match positive descriptors of items, not negations of properties.
The proposal is to use a large language model (LLM) with few-shot prompting to transform the critique into a positive preference. "It doesn't look good for a date" becomes "I prefer more romantic." The transformed preference is then used to retrieve reviews that mention the matching positive aspect; for example, a review reading "Perfect for a romantic dinner" becomes a candidate.
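A minimal sketch of how such a few-shot prompt might be assembled is shown here. The function name `build_prompt`, the instruction wording, and the second and third example pairs are assumptions for illustration; only the first pair comes from this note. The sketch stops at prompt construction and leaves the model call itself out, since the note names no specific model or API.

```python
from typing import List, Tuple

# Few-shot pairs of (critique, positive preference).
# Only the first pair appears in the note; the others are assumed examples.
FEW_SHOT_EXAMPLES: List[Tuple[str, str]] = [
    ("It doesn't look good for a date", "I prefer more romantic"),
    ("It seems too loud", "I prefer a quieter place"),          # assumed
    ("That looks too expensive", "I prefer more affordable"),   # assumed
]


def build_prompt(critique: str,
                 examples: List[Tuple[str, str]] = FEW_SHOT_EXAMPLES) -> str:
    """Assemble a few-shot prompt that ends where the model should continue."""
    lines = [
        "Rewrite each critique of a recommendation as a positive preference.",
        "",
    ]
    for example_critique, example_preference in examples:
        lines.append(f"Critique: {example_critique}")
        lines.append(f"Preference: {example_preference}")
        lines.append("")
    # The new critique goes last, with the preference slot left open.
    lines.append(f"Critique: {critique}")
    lines.append("Preference:")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("The place feels too formal"))
```

Whatever text the model produces after the final "Preference:" would then serve as the retrieval query.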
This works because LLMs can perform the common-sense inference required to convert a negation into a preference: knowing that "good for a date" implies "romantic" or "intimate," that the negation of one property suggests the affirmation of its opposite, and that the relevant aspects to surface depend on the domain. A handful of in-prompt examples is enough to elicit this transformation; no fine-tuning is required.
The architectural pattern is general: when user feedback is naturally expressed in a form the indexing system can't consume, use an LLM as a translator between the feedback and the index's vocabulary. The LLM doesn't need to be the recommender; it only needs to bridge the linguistic gap between user expression and retrieval representation. This cleanly separates the conversational interface from the retrieval infrastructure, so retrieval can stay efficient (matching against review embeddings) while the interface stays natural.
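The translator pattern can be sketched as a small pipeline in which the translation step and the retrieval step are independent. Everything named here is hypothetical: `stub_translator` is a canned lookup standing in for the few-shot LLM call, and a toy bag-of-words cosine similarity stands in for real review embeddings. The two extra reviews are invented for the illustration.

```python
import math
import re
from collections import Counter
from typing import Callable, List, Tuple


def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words; a toy stand-in for a review embedding."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def stub_translator(critique: str) -> str:
    """Stand-in for the LLM: a real system would prompt a model here."""
    canned = {"It doesn't look good for a date": "I prefer more romantic"}
    return canned.get(critique, critique)


def retrieve(critique: str,
             reviews: List[str],
             translate: Callable[[str], str],
             k: int = 1) -> List[Tuple[float, str]]:
    """Translate the critique, then rank reviews against the translation."""
    query = tokenize(translate(critique))
    scored = [(cosine(query, tokenize(review)), review) for review in reviews]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    reviews = [
        "Perfect for a romantic dinner",
        "Great burgers and fast service",    # assumed review
        "Loud sports bar with cheap beer",   # assumed review
    ]
    top = retrieve("It doesn't look good for a date", reviews, stub_translator)
    print(top[0][1])
```

Because `retrieve` only receives a `translate` callable, the retrieval code does not need to know whether the translation came from an LLM, a rule, or a lookup table, which is the separation the pattern relies on.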
Inquiring lines that use this note as a source 25
This note is a source for these synthesized inquiries.
- Can mention sequences exploit shortcuts like repeated items rather than learning genuine preferences?
- Can unified policies handle negative feedback and critique transformation simultaneously?
- What constrains LLM generation beyond default politeness in review contexts?
- How do retrieval systems handle feedback expressed as negations rather than preferences?
- Why do users naturally express recommendation critiques instead of positive preferences?
- What makes few-shot prompting sufficient for critique-to-preference transformation without fine-tuning?
- Does transforming critiques into preferences change how conversational recommenders should decide when to ask versus recommend?
- How do implicit signals like clicks capture preference more reliably than explicit ratings?
- Can preference dimensions extracted from outputs replace topic-based user summaries?
- How should unobserved items differ from items rated zero preference?
- How do semantic reward shaping approaches compare to full critique models?
- How do comparison and debate questions differ in their aspect retrieval needs?
- How do text-based preference summaries compare to embedding vectors for conditioning?
- Can negative feedback through critiques achieve the same steering flexibility as positive preferences?
- Can critique-only calls in LLMs exploit a measurable gap between generation and evaluation?
- Why does sentiment polarity matching matter more than relevance alone?
- Do dialogue systems need different retrieval strategies for opinions versus factual knowledge?
- How do users signal satisfaction through implicit cues that training data misses?
- How can insert-expansion techniques help users discover their own preferences?
- What role do model-based critics play in validating LLM plans?
- What stops AI from helping users articulate preferences they cannot express?
- Can a rejected-edit buffer work like hard negatives in contrastive learning?
- What preference data do different personalized alignment methods actually need?
- Can users modify their preference summaries to steer model behavior?
- Why do untrained summarizers focus on topics rather than preference dimensions?
Related concepts in this collection 5
- Why do queries and documents occupy different embedding spaces?
  Queries and documents express the same information in fundamentally different ways: short and interrogative versus long and declarative. Understanding this mismatch is crucial for why direct embedding retrieval often fails.
  complements: HyDE generates a hypothetical answer to bridge the query-document gap; critique-to-preference generates a hypothetical positive preference to bridge the negation-vocabulary gap. It is the same architectural pattern in a different domain.
- Can implicit feedback reveal both preference and confidence?
  When users take implicit actions like purchases or watches, do those signals carry two separable pieces of information: what they prefer and how certain we should be? Explicit ratings can't make that distinction.
  complements: critiques are a third feedback type beyond explicit and implicit, a natural-language negative signal that transforms into preference
- Can unified policy learning improve conversational recommender systems?
  This explores whether formulating attribute-asking, item-recommending, and timing decisions as a single reinforcement learning policy outperforms treating them as separate components. The question matters because joint optimization could improve conversation quality and system scalability.
  complements: critique-handling is a sub-policy within the broader CRS policy space
- Can users steer recommendations with natural language at inference?
  Can recommendation systems let users specify their preferences in natural language at inference time without retraining? This matters because it would let new users and existing users dynamically adjust what they want to see.
  extends: both let users steer recommendations via natural language at inference time; preference discerning starts from positive preferences while critique transformation starts from negative ones
- Why do users drift away from their original information need?
  When users know their knowledge is incomplete but cannot articulate what's missing, do they unintentionally shift topics? And can real-time systems detect this drift?
  complements: critiques surface as users discover what they don't want; the negation expresses an articulation gap the LLM bridges
Related papers in this collection 8
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- "It doesn't look good for a date": Transforming Critiques into Preferences for Conversational Recommendation Systems
- Preference Discerning with LLM-Enhanced Generative Retrieval
- Style Vectors for Steering Generative Large Language Models
- Large Language Models are Zero-Shot Rankers for Recommender Systems
- Leveraging Large Language Models in Conversational Recommender Systems
- Large Language Models as Conversational Movie Recommenders: A User Study
- When Large Language Models contradict humans? Large Language Models’ Sycophantic Behaviour
- Exploring the Impact of Large Language Models on Recommender Systems: An Extensive Review
Original note title
critique-to-preference transformation enables retrieving better recommendations from natural negative feedback