SYNTHESIS NOTE
Recommender Systems · Model Architecture and Internals

Can a linear model beat deep collaborative filtering?

Does a shallow linear autoencoder with a zero-diagonal constraint outperform deeper neural models on collaborative filtering tasks? If it does, that challenges the field's assumption that depth and nonlinearity drive performance.

Synthesis note · 2026-05-03 · sourced from Recommenders Architectures

A surprising empirical result: a linear model with no hidden layer outperforms most deep collaborative-filtering models. EASE^R (pronounced "easer") is a single item-item weight matrix B trained as an autoencoder: the input vector is the user's interaction history, and the output reconstructs that same history. The only non-trivial constraint is that the diagonal of B must be zero, so an item cannot use itself to predict itself.
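
Written out, the training problem described above can be sketched as a constrained least-squares objective. The symbols X (the user-item interaction matrix, one row per user) and λ (an L2 regularization weight) do not appear in this note, and the square loss with an L2 penalty is assumed here as the usual way this model is stated rather than something the note itself specifies:

```latex
\min_{B} \; \lVert X - XB \rVert_F^2 + \lambda \lVert B \rVert_F^2
\quad \text{subject to} \quad \operatorname{diag}(B) = 0
```

Under these assumptions, each column of B is a ridge regression that predicts one item from all the other items, which is where the zero diagonal enters.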

This constraint is doing all the work. Without it, the model trivially copies inputs to outputs (B collapses toward the identity matrix) and learns nothing. With it, predicting whether a user likes item i forces the model to express i in terms of the other items the user interacted with, which is exactly what generalization in collaborative filtering requires. About 60% of the learned weights turn out to be negative, indicating that the model also learns dissimilarities between items, not just similarities. Setting negative weights to zero degrades performance to roughly the level of L1-regularized SLIM, suggesting that what makes EASE^R special is not sparsity but the ability to encode anti-affinity.
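
The copying behaviour can be checked numerically. The following is a minimal sketch on synthetic data, assuming a square loss with a small L2 penalty `lam`; the random matrix `X`, the variable names, and the penalty value are all illustrative assumptions, not taken from the note. It compares the unconstrained linear autoencoder with the zero-diagonal one and measures the share of negative weights. Random interactions are not expected to reproduce the roughly 60% figure, which comes from real datasets.

```python
import numpy as np

# Synthetic binary interactions (illustration only, not real data).
rng = np.random.default_rng(0)
X = (rng.random((200, 12)) < 0.3).astype(float)

lam = 1e-3                      # assumed small L2 penalty
G = X.T @ X                     # item-item Gram matrix
I = np.eye(G.shape[0])

# Unconstrained ridge autoencoder: B = (G + lam*I)^-1 G.
# With a full-rank G and small lam this is almost exactly the identity.
B_free = np.linalg.solve(G + lam * I, G)
identity_gap = np.abs(B_free - I).max()

# Zero-diagonal variant: B[i, j] = -P[i, j] / P[j, j] off the diagonal.
P = np.linalg.inv(G + lam * I)
B_zero = P / (-np.diag(P))      # broadcasting divides column j by -P[j, j]
np.fill_diagonal(B_zero, 0.0)

# Share of negative weights among the off-diagonal entries.
off_diag = B_zero[~np.eye(G.shape[0], dtype=bool)]
neg_fraction = (off_diag < 0).mean()

print(f"max |B_free - I| = {identity_gap:.2e}")
print(f"negative off-diagonal share = {neg_fraction:.2f}")
```

The first number being close to zero shows the unconstrained model reproducing its input; the constrained matrix has no such shortcut available.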

The closed-form training takes a few lines of code and orders of magnitude less time than SLIM. The result challenges the field's assumption that depth and non-linearity are essential for collaborative filtering (CF): the right structural constraint matters more than expressive capacity. This mirrors the Rendle et al. result for similarity functions, where a plain dot product held its own against learned, MLP-based similarities.
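
A minimal sketch of that closed-form fit, assuming a square loss with L2 penalty `lam` and a small dense matrix for readability. The function name `fit_zero_diag_autoencoder`, the default `lam=0.5`, and the toy matrix are hypothetical; a real dataset would need sparse input and a tuned penalty.

```python
import numpy as np

def fit_zero_diag_autoencoder(X, lam=0.5):
    """Closed-form item-item weights B with diag(B) = 0.

    X   : (num_users, num_items) interaction matrix (dense here for brevity).
    lam : L2 penalty; 0.5 is a placeholder, not a recommended value.
    """
    G = X.T @ X + lam * np.eye(X.shape[1])  # regularized Gram matrix
    P = np.linalg.inv(G)
    B = P / (-np.diag(P))                   # column j divided by -P[j, j]
    np.fill_diagonal(B, 0.0)                # enforce the zero diagonal
    return B

# Toy data: 5 users, 4 items (made up for illustration).
X = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 1],
], dtype=float)

B = fit_zero_diag_autoencoder(X, lam=0.5)
scores = X @ B   # row u holds predicted scores for user u over all items
print(np.round(B, 3))
```

To recommend, one would typically mask the items a user has already interacted with and rank the remaining columns of `scores`. The single matrix inversion is the whole training cost, which is why no iterative optimizer is involved.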

Original note title

EASE^R (easer) beats deep models on collaborative filtering by constraining self-similarity to zero, proving model depth is not what mattered