13.7. Deep Recommender Systems
Let’s look at some recent developments in recommender systems. Most of them are rooted in deep learning, which you can treat as a black box: you feed in data and get predictions out. Under the hood, these models use gradient descent to estimate their parameters.
There are two key ways neural networks extend recommender systems. The first uses deep learning to create latent features for items or users. Spotify, for example, used a factor model to obtain 40-dimensional latent feature vectors for their music. They then wanted to see whether they could construct these latent vectors directly from the music itself, without relying on user data. They framed this as a regression problem with mean squared error as the objective and trained a network to map input audio to the 40-dimensional vectors. This gave them embeddings for their music, including tracks with little or no listening history. Once you have item features, you can create user features in a similar way.
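To make the regression framing concrete, here is a minimal sketch, not Spotify's actual pipeline. It assumes synthetic "audio features" and synthetic 40-dimensional target vectors, and it swaps the deep network for a single linear layer trained by gradient descent on mean squared error. All names and sizes except the 40 dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy sizes; only the 40-dimensional latent space comes from the text.
n_items, n_audio_features, n_latent = 200, 16, 40

# Hypothetical stand-ins: per-track audio features, and the latent vectors
# a factor model would have produced from user data.
audio = rng.normal(size=(n_items, n_audio_features))
true_map = rng.normal(size=(n_audio_features, n_latent))
latent = audio @ true_map + 0.1 * rng.normal(size=(n_items, n_latent))

# A single linear layer stands in for the deep network.
W = np.zeros((n_audio_features, n_latent))
# The step size looks large because the MSE gradient is averaged over
# every (item, dimension) pair, which makes it small.
learning_rate = 5.0

def mse(pred, target):
    return float(np.mean((pred - target) ** 2))

initial_loss = mse(audio @ W, latent)
for _ in range(300):
    error = audio @ W - latent
    grad = 2.0 * audio.T @ error / error.size  # gradient of the MSE w.r.t. W
    W -= learning_rate * grad
final_loss = mse(audio @ W, latent)

# Embed a track that has no user data at all, using only its audio features.
new_track_audio = rng.normal(size=n_audio_features)
new_track_embedding = new_track_audio @ W
print(initial_loss, final_loss, new_track_embedding.shape)
```

The point of the sketch is the last step: once the mapping is trained, a new track gets a latent vector from its content alone.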
The second approach is to use autoencoders for nonlinear embedding. An autoencoder takes an input, passes it through a series of layers that includes a narrow bottleneck layer, and tries to reconstruct the original input. The network is trained by minimizing the mean squared error between the input and the reconstructed output. Once the network is trained, the activations of the bottleneck layer serve as a lower-dimensional representation of the original input.
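A minimal sketch of this idea follows, assuming toy data and a deliberately tiny network: one tanh encoder layer that compresses 8 inputs to a 2-dimensional code, and one linear decoder layer. The sizes, learning rate, and variable names are assumptions chosen for illustration; a practical autoencoder would normally be built with a deep learning framework and more layers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 8-dimensional inputs generated from 2 hidden factors.
n_samples, n_inputs, n_code = 300, 8, 2
hidden_factors = rng.normal(size=(n_samples, n_code))
X = np.tanh(hidden_factors @ rng.normal(size=(n_code, n_inputs)))

# Small random initial weights for the encoder and decoder.
W_enc = 0.1 * rng.normal(size=(n_inputs, n_code))
W_dec = 0.1 * rng.normal(size=(n_code, n_inputs))
learning_rate = 0.05

def encode(x):
    """Nonlinear bottleneck: the lower-dimensional representation."""
    return np.tanh(x @ W_enc)

def reconstruct(x):
    """Decode the bottleneck code back to the input space."""
    return encode(x) @ W_dec

def mse(a, b):
    return float(np.mean((a - b) ** 2))

initial_loss = mse(reconstruct(X), X)
for _ in range(4000):
    code = encode(X)
    recon = code @ W_dec
    d_recon = 2.0 * (recon - X) / X.size           # dLoss/dRecon for the MSE
    grad_dec = code.T @ d_recon                    # backprop into the decoder
    d_code = d_recon @ W_dec.T
    grad_enc = X.T @ (d_code * (1.0 - code ** 2))  # tanh'(z) = 1 - tanh(z)^2
    W_dec -= learning_rate * grad_dec
    W_enc -= learning_rate * grad_enc
final_loss = mse(reconstruct(X), X)

embedding = encode(X)  # 2-dimensional code for every input row
print(initial_loss, final_loss, embedding.shape)
```

After training, only `encode` is needed to produce embeddings; the decoder exists to provide the reconstruction target during training.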
Next, there are two-stage recommender systems. The first stage is filtering: a vast catalog is narrowed down to a much smaller candidate set, typically around 50 items. The second stage is ranking: a model estimates the user's preference for each of these 50 items, assigning it a probability or numeric score, and the top-ranked items are recommended to the user. The filtering stage usually relies on simple, cheap matching because it must scan the whole catalog, while the ranking stage can afford a more complex model that uses user information because it only scores a handful of candidates.
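The two stages can be sketched as follows. This is an illustrative toy, not a production design: it assumes random item and user vectors, uses a dot product as the "simple matching" filter, and uses an untrained logistic scorer with made-up weights as the ranking model. The function names, the popularity feature, and all sizes other than the 50 candidates are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

n_items, dim = 10_000, 16
item_vecs = rng.normal(size=(n_items, dim))   # hypothetical item embeddings
user_vec = rng.normal(size=dim)               # hypothetical user embedding
item_popularity = rng.uniform(size=n_items)   # hypothetical extra feature
ranker_weights = rng.normal(size=dim + 1)     # untrained, for illustration only

def filter_candidates(user_vec, item_vecs, n_candidates=50):
    """Stage 1: cheap dot-product matching over the whole catalog."""
    scores = item_vecs @ user_vec
    # argpartition selects the top n_candidates without sorting everything.
    return np.argpartition(-scores, n_candidates - 1)[:n_candidates]

def rank_candidates(user_vec, candidate_ids, top_k=5):
    """Stage 2: a richer (here logistic) scorer applied to candidates only."""
    interactions = item_vecs[candidate_ids] * user_vec
    features = np.column_stack([interactions, item_popularity[candidate_ids]])
    probs = 1.0 / (1.0 + np.exp(-(features @ ranker_weights)))
    order = np.argsort(-probs)[:top_k]
    return candidate_ids[order], probs[order]

candidates = filter_candidates(user_vec, item_vecs)
recommended, probabilities = rank_candidates(user_vec, candidates)
print(recommended, probabilities)
```

The expensive scorer runs on 50 rows instead of 10,000, which is the main reason for splitting the work into two stages.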
One popular model for the ranking stage, developed by Google, is the Wide & Deep model. It combines a deep neural network, which can learn complex relationships and generalize to unseen feature combinations, with a wide component, essentially a simple logistic regression model that excels at memorizing specific user-item interactions. The two components are trained jointly and their outputs are combined into a single prediction. This hybrid strikes a balance between generalization and memorization, which makes it useful not just for recommender systems but for classification tasks in general.
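The forward pass of such a hybrid can be sketched in a few lines. This is a simplified illustration with untrained random weights, not Google's implementation: the wide part is assumed to hold one weight per (user, item) pair, which is what a linear model over a one-hot user-item cross feature reduces to, and the deep part is assumed to be a one-hidden-layer network over user and item embeddings. All sizes and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

n_users, n_items, embed_dim, n_hidden = 5, 7, 4, 8

# Wide part: one weight per (user, item) cross feature (memorization).
wide_weights = 0.1 * rng.normal(size=(n_users, n_items))

# Deep part: embeddings followed by a small ReLU network (generalization).
user_emb = rng.normal(size=(n_users, embed_dim))
item_emb = rng.normal(size=(n_items, embed_dim))
W1 = 0.5 * rng.normal(size=(2 * embed_dim, n_hidden))
b1 = np.zeros(n_hidden)
w2 = 0.5 * rng.normal(size=n_hidden)
bias = 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(user_id, item_id):
    """Probability that the user likes the item (weights are untrained)."""
    wide_logit = wide_weights[user_id, item_id]
    x = np.concatenate([user_emb[user_id], item_emb[item_id]])
    hidden = np.maximum(0.0, x @ W1 + b1)
    deep_logit = hidden @ w2
    # The two logits are summed before a single sigmoid.
    return float(sigmoid(wide_logit + deep_logit + bias))

print(predict(0, 1))
```

Because the wide weight for a pair is used only by that pair, adjusting it changes the prediction for that one user-item combination, while the shared embeddings and hidden layer in the deep part affect many pairs at once.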