Reducing Dimensions by Maximizing Variance
PCA ranks principal components by the variance they capture, measured by their eigenvalues. Keeping the top k components preserves the most variance, since each successive component captures no more variance than the previous one and is orthogonal to all earlier components. This reduces dimensions while retaining the most informative directions in your data.
The explained variance ratio for each principal component is:
$$\text{Explained Variance Ratio}_i = \frac{\lambda_i}{\sum_j \lambda_j}$$
where $\lambda_i$ is the i-th largest eigenvalue. This ratio shows how much of the total variance in your data is captured by each principal component. The sum of all explained variance ratios is always 1, since all eigenvalues together account for the total variance in the dataset.
import numpy as np

# Using eigenvalues from previous code
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9]])
X_centered = X - np.mean(X, axis=0)

# Covariance matrix of the centered data
cov_matrix = (X_centered.T @ X_centered) / X_centered.shape[0]
values, vectors = np.linalg.eig(cov_matrix)

# Each eigenvalue divided by the total gives that component's explained variance ratio
explained_variance_ratio = values / np.sum(values)
print("Explained variance ratio:", explained_variance_ratio)
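Note that np.linalg.eig does not return eigenvalues in any guaranteed order, so it is worth sorting them in descending order before ranking components. A minimal sketch, continuing from the values and vectors computed above:

# Sort eigenvalues (and matching eigenvector columns) from largest to smallest
order = np.argsort(values)[::-1]
values_sorted = values[order]
vectors_sorted = vectors[:, order]

# Explained variance ratios now correspond to the 1st, 2nd, ... principal components
explained_variance_ratio = values_sorted / np.sum(values_sorted)
print("Sorted explained variance ratio:", explained_variance_ratio)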
Selecting the top principal components whose explained variance ratios sum to a chosen threshold, such as 95%, lets you reduce the number of dimensions while keeping most of the data's information. You keep only the directions along which the data spreads the most, which are the most informative for analysis or modeling, and discard the rest. This trade-off between dimensionality and retained information is a key advantage of PCA.
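As a rough sketch of how this threshold-based selection might look in NumPy, assuming explained_variance_ratio holds the ratios sorted in descending order as in the snippet above and 0.95 is the chosen threshold:

# Cumulative explained variance across the ranked components
cumulative = np.cumsum(explained_variance_ratio)

# Smallest number of components whose cumulative explained variance reaches 95%
k = int(np.argmax(cumulative >= 0.95)) + 1
print("Cumulative explained variance:", cumulative)
print("Components needed for 95% variance:", k)

Projecting the centered data onto the first k eigenvectors then gives the reduced-dimension representation.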