INQUIRING LINE

Can side information alone predict preferences without rating history?

This explores whether attributes and metadata about users and items — demographics, item features, observed behavior — can predict what someone will like when there's no rating history to lean on, the classic 'cold-start' problem.


The honest answer from the corpus: side information rarely works *alone*, but it's exactly the lever that lets systems make predictions before any ratings exist, and the interesting question is how it gets fused with everything else.

The cold-start case is where side information earns its keep. "Can autoencoders solve the cold-start problem in recommendations?" shows that deep autoencoders blending rating history with side information (via graph-derived features) can predict for brand-new users and items precisely because the side features carry signal when interaction data is absent. Note, though, that the framing is *combination*, not substitution. "Can graphs unify collaborative filtering and side information?" pushes the same idea structurally: it folds user-item interactions and item attributes into a single knowledge graph so that attribute-similarity ('these two items share properties') and behavior-similarity ('people who liked X liked Y') reinforce each other. The recurring lesson is that attributes and ratings are different similarity channels, and the win comes from letting them propagate through one another rather than betting on either alone.
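A minimal sketch of that 'combination, not substitution' pattern, in Python with NumPy. It is not GHRS or KGAT: it swaps in plain per-user ridge regression on item attributes, and the toy ratings, the two genre attributes, and the function names are all hypothetical. Rating history on known items is used to learn each user's attribute weights, and the side features then carry the prediction for an item nobody has rated.

```python
import numpy as np


def fit_user_attribute_weights(ratings, item_attrs, reg=0.1):
    """Ridge-regress each user's observed ratings on item attributes.

    ratings:    (n_users, n_items) array, np.nan where unobserved.
    item_attrs: (n_items, n_attrs) side-information matrix.
    Returns an (n_users, n_attrs + 1) weight matrix; the last column is a bias.
    """
    n_items = item_attrs.shape[0]
    # Append a constant column so each user also gets a bias term.
    X = np.hstack([item_attrs, np.ones((n_items, 1))])
    W = np.zeros((ratings.shape[0], X.shape[1]))
    for u in range(ratings.shape[0]):
        seen = ~np.isnan(ratings[u])          # items this user has rated
        Xu, yu = X[seen], ratings[u, seen]
        # Regularised normal equations; reg also keeps the system solvable
        # when attribute columns are collinear with the bias column.
        A = Xu.T @ Xu + reg * np.eye(X.shape[1])
        W[u] = np.linalg.solve(A, Xu.T @ yu)
    return W


def predict_cold_item(W, new_item_attrs):
    """Score an item with no ratings, using only its attributes."""
    x = np.append(new_item_attrs, 1.0)        # same bias column as in training
    return W @ x


# Toy data (hypothetical): attributes are [is_scifi, is_romance].
item_attrs = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
ratings = np.array([
    [5.0, 5.0, 1.0, 1.0],      # user 0 prefers sci-fi
    [1.0, np.nan, 5.0, 5.0],   # user 1 prefers romance, one item unrated
])

W = fit_user_attribute_weights(ratings, item_attrs)
cold_preds = predict_cold_item(W, np.array([1.0, 0.0]))  # new sci-fi item
print(cold_preds)
```

In this toy setup the unrated sci-fi item scores high for the first user and low for the second, but only because ratings on other items taught the model which attributes each user cares about.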

But there's a more radical reading of 'side information': what if the system *watches* instead of asking for ratings at all? "Can agents learn preferences by watching rather than asking?" demonstrates agents that infer preferences from continuous multimodal observation, building an entity-centric memory of you without a single explicit rating. That reframes the question entirely: the alternative to rating history isn't necessarily static attributes; it can be behavioral observation over time, which is its own kind of side channel.

There's also a twist on what 'preference' even means once you skip ratings. "Does abstract preference knowledge outperform specific interaction recall?" finds that abstract preference summaries beat replaying specific past interactions, so distilling someone into a compact semantic profile (a form of derived side information) can outperform their raw history. And "Can user preferences be learned from just ten questions?" shows you can pin down a person with as few as ten well-chosen questions, suggesting that a small amount of active elicitation can substitute for a long rating log. Both hint that the volume of history matters less than having the right compressed representation.
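To illustrate how a handful of well-chosen questions can stand in for a long log, here is a hedged sketch of uncertainty-driven question selection. It assumes a linear preference model over a few base reward scores, a Gaussian prior on the user's coefficients, and numeric answers with a known noise variance. That is a simplification of pairwise preference feedback and not the procedure from the underlying paper; every name and constant below is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_base, n_candidates, n_questions = 4, 200, 10
noise_var = 0.25  # assumed variance of the noise in a user's answer

# Each candidate question compares two responses; its feature vector stands
# for the difference of their (hypothetical) base-reward scores.
candidates = rng.normal(size=(n_candidates, n_base))
true_coef = np.array([1.0, -0.5, 0.2, 0.8])  # hidden user coefficients (toy)

mean = np.zeros(n_base)  # prior mean over the user's coefficients
cov = np.eye(n_base)     # prior covariance (assumed isotropic)
available = np.ones(n_candidates, dtype=bool)
asked = []
trace_history = [float(np.trace(cov))]

for _ in range(n_questions):
    # Predictive variance d^T cov d of the answer to every candidate question.
    variances = np.sum((candidates @ cov) * candidates, axis=1)
    variances = np.where(available, variances, -np.inf)  # never repeat one
    q = int(np.argmax(variances))                        # most uncertain question
    d = candidates[q]

    # Simulated user answer: true preference signal plus noise.
    answer = d @ true_coef + rng.normal(scale=np.sqrt(noise_var))

    # Conjugate Gaussian (rank-one) posterior update of mean and covariance.
    cov_d = cov @ d
    gain = cov_d / (noise_var + d @ cov_d)
    mean = mean + gain * (answer - d @ mean)
    cov = cov - np.outer(gain, cov_d)

    available[q] = False
    asked.append(q)
    trace_history.append(float(np.trace(cov)))

print("questions asked:", asked)
print("total coefficient variance per step:", np.round(trace_history, 3))
print("estimated coefficients:", np.round(mean, 2))
```

Asking the question whose answer is currently least predictable shrinks the total coefficient variance at every step, which is the sense in which ten targeted questions can do the work of many passively collected ratings under these assumptions.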

The quiet warning underneath all this: rating history may be a noisier signal than you'd assume. "Do online ratings actually reflect independent customer opinions?" shows ratings are contaminated by social influence from prior ratings, and "Do all annotation responses measure the same underlying thing?" finds that stated preferences mix genuine taste with non-attitudes and on-the-spot constructions. So 'side information vs. rating history' isn't a clean contest between reliable and unreliable data. Both are partial views, which is exactly why the corpus keeps fusing them.


Sources (7 notes)

Can autoencoders solve the cold-start problem in recommendations?

GHRS uses graph features and deep autoencoders to integrate rating history with side information, enabling predictions for new users and items by discovering non-linear relationships that linear hybrid methods miss.

Can graphs unify collaborative filtering and side information?

KGAT merges user-item interaction graphs with item knowledge graphs into a Collaborative Knowledge Graph, using attention-based propagation to capture both user-similarity and attribute-similarity signals simultaneously—including high-order connections that standard supervised learning methods miss.

Can agents learn preferences by watching rather than asking?

M3-Agent demonstrates that separating episodic events from semantic knowledge in an entity-centric graph, combined with parallel memorization and control processes, allows agents to infer and act on user preferences without asking. This architecture mirrors human cognitive systems that bind disparate information about individuals across sensory modalities.

Does abstract preference knowledge outperform specific interaction recall?

PRIME framework shows semantic memory (preference summaries, parametric encodings) consistently beats episodic memory (retrieved past interactions) across models. Recency-based recall outperforms similarity-based retrieval, and task fine-tuning exceeds preference tuning methods.

Can user preferences be learned from just ten questions?

PReF learns base reward functions from preference data, then uses active learning to select maximally informative questions that reduce coefficient uncertainty. Users can be personalized via inference-time reward alignment without weight modification.

Do online ratings actually reflect independent customer opinions?

Moe and Trusov decomposed ratings into baseline quality, social-dynamics influence, and error, finding that prior ratings meaningfully affect subsequent ones. These effects have both immediate sales impact and long-term compounding effects through future ratings, though high opinion variance can eventually dampen the distortion.

Do all annotation responses measure the same underlying thing?

Behavioral science reveals that annotations contain genuine preferences, non-attitudes, and constructed preferences—distinguishable by consistency across measurement conditions. Treating them uniformly contaminates reward model training and downstream alignment.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

As a recommendation systems researcher, evaluate whether side information (user/item attributes, metadata, behavior signals) can predict preferences WITHOUT rating history — treating this as still-open despite recent advances.

What a curated library found — and when (dated claims, not current truth): Findings span 2019–2026, but concentrate heavily in 2025–2026.
• Autoencoders over graph-derived features + side information excel at cold-start (new users/items) but always *combine* attributes with interaction data; neither alone is sufficient (2019–2020).
• Knowledge graphs unifying interactions and item attributes let similarity propagate bidirectionally—attributes and ratings are complementary channels, not substitutes (2019).
• Behavioral observation over time (multimodal, continuous) can infer preferences without explicit ratings, reframing "side information" as episodic memory (2024–2025).
• Compressed semantic preference profiles outperform raw rating history; as few as ~10 well-chosen elicitation questions substitute for long logs (2025).
• Rating history itself is noisy—contaminated by social influence and preference mixing (genuine taste + non-attitudes); side information is not obviously inferior (2019–2026).

Anchor papers (verify; mind their dates):
• arXiv:1905.07854 (2019) — KGAT: Knowledge Graph Attention
• arXiv:2503.06358 (2025) — Reward Factorization for Personalization
• arXiv:2604.03238 (2026) — Measuring Human Preferences as Social Science

Your task:
(1) RE-TEST each constraint. For every finding above, ask: have newer models (multimodal LLMs, vision-language), training tricks (in-context learning, preference fine-tuning), or orchestration (multi-turn dialogue, memory scaffolding) since RELAXED the requirement for rating history? Separate the durable question ("Can we infer latent preferences?") from the perishable limitation ("We need X ratings first"). Cite what relaxed it; flag what still holds.
(2) Surface the strongest CONTRADICTING or SUPERSEDING work from the last ~6 months—especially papers claiming side information alone suffices, or conversely, that ratings remain irreplaceable.
(3) Propose 2 research questions that assume the regime may have shifted: e.g., "In an LLM-personalization pipeline, does in-context user profile (side info) + 1–2 dialogue turns replace a 50-rating cold-start baseline?" or "Can multimodal behavioral traces (video, text, implicit signals) wholly substitute for explicit preference elicitation?"

Cite arXiv IDs; flag anything you cannot ground in a real paper.
