Factoring Sparse Interaction Matrices Using Singular Value Decomposition | Deep Personalization through Matrix Factorization


SVD Overview

Definition

Singular Value Decomposition, or SVD, is a matrix factorization that expresses any real matrix as the product of three simpler matrices: two with orthonormal columns and one diagonal.

In the context of recommendation systems, SVD is often used to analyze and compress large, sparse user-item interaction matrices, revealing hidden patterns and relationships.

Mathematical Explanation

Given a matrix A (such as a user-item interaction matrix), SVD factorizes it into three matrices: U, Σ, and V^T. The relationship can be described as:

A = U Σ V^T

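U is a matrix whose columns are the left singular vectors, which are orthonormal;
Σ (Sigma) is a diagonal matrix containing the singular values, which are non-negative and conventionally sorted from largest to smallest;
V^T is the transpose of a matrix whose columns are the right singular vectors, which are also orthonormal.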
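
The properties listed above can be checked numerically. The following is a minimal sketch using `np.linalg.svd`; the matrix `A` is a hypothetical example chosen only for illustration, and the variable names are not taken from the lesson.

```python
import numpy as np

# Hypothetical small matrix, used only to check the factorization.
A = np.array([
    [5.0, 3.0, 1.0],
    [4.0, 1.0, 1.0],
    [1.0, 1.0, 5.0],
    [2.0, 1.0, 4.0],
])

# full_matrices=False returns the compact ("economy") factors.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(U.shape, s.shape, Vt.shape)  # (4, 3) (3,) (3, 3)

# NumPy returns the singular values as a 1-D array,
# so the diagonal matrix Sigma is rebuilt with np.diag.
reconstructed = U @ np.diag(s) @ Vt

reconstruction_ok = np.allclose(A, reconstructed)    # A = U Sigma V^T
orthonormal_U = np.allclose(U.T @ U, np.eye(3))      # columns of U are orthonormal
orthonormal_V = np.allclose(Vt @ Vt.T, np.eye(3))    # columns of V are orthonormal
sorted_descending = bool(np.all(np.diff(s) <= 0))    # largest singular value first

print(reconstruction_ok, orthonormal_U, orthonormal_V, sorted_descending)
```

Note that NumPy hands back V^T directly (here `Vt`), not V, so no extra transpose is needed when multiplying the factors back together.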

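This decomposition allows you to approximate the original matrix using only the k largest singular values and their corresponding vectors. The resulting rank-k matrix is the closest rank-k approximation to A in the least-squares (Frobenius norm) sense, which makes truncation especially useful in high-dimensional, sparse data scenarios.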
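
To make the effect of truncation concrete, the sketch below measures how far the rank-k reconstruction is from the original for each k. The random matrix and the helper `truncate` are hypothetical and exist only for this illustration. The Frobenius-norm error should match the square root of the sum of the squared singular values that were discarded.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((6, 4))  # hypothetical dense matrix, for illustration only

U, s, Vt = np.linalg.svd(A, full_matrices=False)


def truncate(U, s, Vt, k):
    """Rank-k reconstruction from the top k singular values and vectors."""
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]


errors = []
for k in range(1, len(s) + 1):
    A_k = truncate(U, s, Vt, k)
    # np.linalg.norm on a 2-D array defaults to the Frobenius norm.
    err = np.linalg.norm(A - A_k)
    discarded = np.sqrt(np.sum(s[k:] ** 2))
    errors.append(err)
    print(f"k={k}: error={err:.4f}, sqrt(sum of discarded sigma^2)={discarded:.4f}")
```

The error shrinks as k grows and reaches (numerically) zero once all singular values are kept, so choosing k is a trade-off between compression and fidelity.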

Role in Recommendations

In recommendation systems, user-item matrices are typically sparse, with many missing entries (for example, unrated products). Classical SVD is defined only for a complete matrix, so the missing entries must first be filled in, for example with a mean rating. SVD then helps by uncovering latent features that explain the observed interactions. By reconstructing the matrix from a reduced number of singular values, you obtain estimates for the entries that were originally missing, that is, predictions of how a user might rate an item they have not yet interacted with. This enables personalized recommendations based on inferred preferences rather than only explicit data.
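
One way to picture the latent features is to give every user and every item a short vector, so that a predicted rating is the dot product of the two. The sketch below assumes a small, hypothetical ratings matrix with no missing entries left, and it splits Σ evenly between the user and item sides, which is one common convention and not the only possible one. The names `user_factors` and `item_factors` are illustrative.

```python
import numpy as np

# Hypothetical ratings, already imputed (no missing entries remain).
ratings = np.array([
    [5.0, 4.0, 1.0, 1.0],
    [4.0, 5.0, 2.0, 1.0],
    [1.0, 1.0, 5.0, 4.0],
    [2.0, 1.0, 4.0, 5.0],
])

U, s, Vt = np.linalg.svd(ratings, full_matrices=False)

k = 2  # number of latent features kept (an arbitrary choice here)
sqrt_s = np.sqrt(s[:k])

# Each row is one user's (or one item's) position in the latent space.
user_factors = U[:, :k] * sqrt_s       # shape (n_users, k)
item_factors = Vt[:k, :].T * sqrt_s    # shape (n_items, k)

# A predicted rating is the dot product of a user vector and an item vector.
user, item = 0, 2
predicted = user_factors[user] @ item_factors[item]

# The same number appears in the rank-k reconstruction of the matrix.
rank_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
print(np.isclose(predicted, rank_k[user, item]))  # True
```

Users whose vectors point in similar directions have similar inferred tastes, which is what lets the reconstruction fill in ratings that were never observed.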

Applying SVD to a Sparse User-Item Matrix

Suppose you have a user-item matrix where rows represent users and columns represent items. Many entries are missing and are often stored as zero; such a zero means "unknown", not a rating of zero, so it should be replaced with a sensible estimate before SVD is applied. The example below fills the missing entries with the mean of the observed ratings, reduces the matrix to its two most significant components, and uses the reconstruction to estimate the missing values, which can then be used to generate recommendations.


import numpy as np

# Example sparse user-item matrix (0 indicates missing/unknown rating)
user_item_matrix = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 0, 5, 4],
    [0, 1, 5, 4],
])

# Filling missing values (0) with the mean of non-zero elements for SVD
mean_value = user_item_matrix[user_item_matrix != 0].mean()
filled_matrix = np.where(user_item_matrix == 0, mean_value, user_item_matrix)

# Performing SVD
U, sigma, VT = np.linalg.svd(filled_matrix, full_matrices=False)
# Keeping only top 2 singular values for dimensionality reduction
k = 2
U_k = U[:, :k]
sigma_k = np.diag(sigma[:k])
VT_k = VT[:k, :]

# Reconstructing the matrix using only top k components
approx_matrix = U_k @ sigma_k @ VT_k

print('Original matrix with missing values filled:')
print(np.round(filled_matrix, 2))
print('\nApproximated matrix (using top 2 singular values):')
print(np.round(approx_matrix, 2))
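
The approximated matrix can then be turned into recommendations by ranking, for each user, the items they have not rated yet. The following is a minimal sketch under the same assumptions as the example above (zeros mark missing ratings, global-mean imputation, k = 2). The helper `recommend` and its parameters are hypothetical and not part of any library.

```python
import numpy as np

user_item_matrix = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 0, 5, 4],
    [0, 1, 5, 4],
])


def recommend(ratings, k=2, top_n=2):
    """Return, per user, indices of unrated items ranked by predicted rating.

    Hypothetical helper: zeros are treated as missing ratings.
    """
    rated = ratings != 0
    # Impute missing entries with the mean of the observed ratings.
    filled = np.where(rated, ratings, ratings[rated].mean())
    U, s, Vt = np.linalg.svd(filled, full_matrices=False)
    approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

    # Push already-rated items to the bottom so they are never recommended.
    scores = np.where(rated, -np.inf, approx)

    recommendations = {}
    for user in range(ratings.shape[0]):
        ranked = np.argsort(scores[user])[::-1]  # highest predicted score first
        unrated = [int(i) for i in ranked if not rated[user, i]]
        recommendations[user] = unrated[:top_n]
    return recommendations


recs = recommend(user_item_matrix)
for user, items in recs.items():
    print(f"user {user}: recommended item indices {items}")
```

With a matrix this small, each user has only one or two unrated items, so the ranking matters mostly for users 1 and 3. On a realistic catalogue, `top_n` would pick the few highest-scoring items out of many candidates.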

1. Which matrix in SVD contains the singular values that capture the importance of each latent feature?

2. What is the primary purpose of applying SVD to a user-item matrix in recommendation systems?


Section 4. Chapter 2
