Curse of “Low” Dimensionality in Recommender Systems
Beyond accuracy, there are a variety of aspects to the quality of recommender systems, such as diversity, fairness, and robustness. We argue that many of the prevalent problems in recommender systems are partly due to low-dimensionality of user and item embeddings, particularly when dot-product models, such as matrix factorization, are used. In this study, we showcase empirical evidence suggesting the necessity of sufficient dimensionality for user/item embeddings to achieve diverse, fair, and robust recommendation. We then present theoretical analyses of the expressive power of dot-product models. Our theoretical results demonstrate that the number of possible rankings expressible under dot-product models is exponentially bounded by the dimension of item factors. We empirically found that the low-dimensionality contributes to a popularity bias, widening the gap between the rank positions of popular and long-tail items; we also give a theoretical justification for this phenomenon.
Introduction. Matrix factorization (MF) [24] is a standard tool in recommender systems. MF can learn compact, retrieval-efficient representation and thus is easy-to-use, particularly in large-scale applications. On the other hand, due to the advancement of deep learning-based methods, complex nonlinear models have also been adopted to enhance recommendation quality [54]. However, despite the advances in model architecture, most models share a common structure, i.e., dot-product models [47]. Dot-product models are a class of models that estimate the preference for a user-item pair by computing a dot-product (inner product) between the user and item embeddings; MF is the simplest model in this class. This structure is essential for large-scale applications due to its computationally efficient retrieval through vector search algorithms [25, 34, 52]. The dimensionality of user/item embeddings characterizes dotproduct models. One interpretation of dimensionality is that it refers to the ranks of user/item embedding matrices that minimize the errors between true and estimated preferences.
Discussion / Conclusion. Efficient Solvers for High Dimensionality. High-dimensional models are often computationally costly. Even in the most efficient methods based on MF, the optimization involves cubic dependency on dand thus does not scale well for high-dimensional models. Motivated by this, previous studies have explored efficient methods for high-dimensional models [7, 20, 48]. Because the traditional ALS solver for MF is in the class of block coordinate descent (CD), conventional methods can be derived by designing the choice of blocks [48]. It is hence interesting to design block CD methods for high-dimensional models considering efficiency on modern computer architectures (e.g., CPU with SIMD, GPU, and TPU [37]). Developing solvers for more complex models, such as factorization machines [7, 46], is useful to leverage side information. Since deep learning-based models are also employed, efficient solvers for such non-linear models [66] will be beneficial for enhancing the ranking quality.