Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model
Recommender systems provide users with personalized suggestions for products or services. These systems often rely on Collaborative Filtering (CF), where past transactions are analyzed in order to establish connections between users and products. The two most successful approaches to CF are latent factor models, which directly profile both users and products, and neighborhood models, which analyze similarities between products or users. In this work we introduce some innovations to both approaches. The factor and neighborhood models can now be smoothly merged, thereby building a more accurate combined model. Further accuracy improvements are achieved by extending the models to exploit both explicit and implicit feedback from users. The methods are tested on the Netflix data. Results are better than those previously published on that dataset. In addition, we suggest a new evaluation metric, which highlights the differences among methods, based on their performance at a top-K recommendation task.
Introduction. Modern consumers are inundated with choices. Electronic retailers and content providers offer a huge selection of products, with unprecedented opportunities to meet a variety of special needs and tastes. Matching consumers with the most appropriate products is not trivial, yet it is key to enhancing user satisfaction and loyalty. This emphasizes the prominence of recommender systems, which provide personalized recommendations for products that suit a user’s taste [1]. Internet leaders like Amazon, Google, Netflix, TiVo and Yahoo are increasingly adopting such recommenders. Recommender systems are often based on Collaborative Filtering (CF) [10], which relies only on past user behavior, e.g., their
Discussion / Conclusion. This work proposed improvements to two of the most popular approaches to Collaborative Filtering. First, we suggested a new neighborhood-based model, which, unlike previous neighborhood methods, is based on formally optimizing a global cost function. This leads to improved prediction accuracy, while maintaining merits of the neighborhood approach such as explainability of predictions and the ability to handle new users without re-training the model. Second, we introduced extensions to SVD-based latent factor models that allow improved accuracy by integrating implicit feedback into the model. One of the models also provides advantages that are usually regarded as belonging to neighborhood models, namely, an ability to explain recommendations and to handle new users seamlessly. In addition, the new neighborhood model enables us to derive, for the first time, an integrated model that combines the neighborhood and the latent factor models. This is helpful for improving system performance, as the neighborhood and latent factor models address the data at different levels and complement each other.
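To make the idea of an integrated model concrete, the following is a minimal sketch of how a single predicted rating might sum a baseline estimate, a latent factor term enriched with implicit feedback, and a neighborhood term. It is an illustration under stated assumptions, not the exact formulation or training procedure of this work: the function name `predict_integrated`, its parameter names, and the inverse-square-root normalization of each sum are all hypothetical choices made for this example, and the parameters are assumed to have been learned already.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def predict_integrated(mu, b_u, b_i, q_i, p_u, y_implicit, residuals, w, c):
    """Illustrative combined prediction for one (user, item) pair.

    All names are hypothetical:
      mu, b_u, b_i : global mean, user bias, item bias (baseline estimate)
      q_i, p_u     : item and user latent factor vectors
      y_implicit   : factor vectors of items the user gave implicit feedback on
      residuals    : (rating - baseline) for neighbor items the user rated
      w            : item-item weights aligned with `residuals`
      c            : item-item offsets for neighbor items with implicit feedback
    """
    # Latent factor part: augment the user vector with implicit-feedback
    # factors, scaled by 1/sqrt(count) (an assumed normalization).
    user_vec = list(p_u)
    if y_implicit:
        norm = 1.0 / math.sqrt(len(y_implicit))
        for y_j in y_implicit:
            user_vec = [u + norm * y for u, y in zip(user_vec, y_j)]
    factor_term = dot(q_i, user_vec)

    # Neighborhood part: weighted residuals of rated neighbors, plus
    # offsets for neighbors with implicit feedback, each normalized.
    neighborhood_term = 0.0
    if residuals:
        neighborhood_term += dot(residuals, w) / math.sqrt(len(residuals))
    if c:
        neighborhood_term += sum(c) / math.sqrt(len(c))

    return mu + b_u + b_i + factor_term + neighborhood_term


# Toy usage with made-up numbers (not taken from the Netflix data).
prediction = predict_integrated(
    mu=3.6, b_u=0.1, b_i=-0.2,
    q_i=[1.0, 0.0], p_u=[0.5, 0.5],
    y_implicit=[[0.5, 0.0]] * 4,
    residuals=[1.0, -1.0, 0.5, 0.5], w=[0.2, 0.2, 0.2, 0.2],
    c=[0.1, 0.1, 0.1, 0.1],
)
print(prediction)
```

In this sketch the two parts contribute independently to the sum, which reflects the observation that neighborhood and latent factor models address the data at different levels: the factor term captures the overall user-item affinity, while the neighborhood term adjusts it using a small set of closely related items.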