Derivation of PCA Using Linear Algebra
PCA seeks a new set of axes, called principal components, such that the projected data has maximum variance. The first principal component, denoted $w_1$, is chosen to maximize the variance of the projected data:

$$\mathrm{Var}(X w_1)$$

subject to the constraint that $\|w_1\| = 1$. The solution to this maximization problem is the eigenvector of the covariance matrix corresponding to the largest eigenvalue.
The optimization problem is:
$$\max_{w}\; w^T \Sigma w \quad \text{subject to} \quad \|w\| = 1$$

Any solution $w$ satisfies $\Sigma w = \lambda w$, where $\lambda$ is the corresponding eigenvalue; in other words, $w$ is an eigenvector of the covariance matrix $\Sigma$. Because the variance achieved by such a $w$ is $w^T \Sigma w = \lambda w^T w = \lambda$, the maximum is attained by the eigenvector with the largest eigenvalue.
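For completeness, here is the standard Lagrange-multiplier step behind that eigenvalue equation (a brief sketch, with $\mathcal{L}$ denoting the Lagrangian of the constrained problem):

$$\mathcal{L}(w, \lambda) = w^T \Sigma w - \lambda\left(w^T w - 1\right), \qquad \nabla_w \mathcal{L} = 2\Sigma w - 2\lambda w = 0 \;\Longrightarrow\; \Sigma w = \lambda w.$$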
import numpy as np

# Small example dataset
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9]])

# Center the data and compute its covariance matrix
X_centered = X - np.mean(X, axis=0)
cov_matrix = (X_centered.T @ X_centered) / X_centered.shape[0]

# Find the principal component (eigenvector with the largest eigenvalue)
values, vectors = np.linalg.eig(cov_matrix)
principal_component = vectors[:, np.argmax(values)]

print("First principal component:", principal_component)
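A side note on the eigendecomposition call: since a covariance matrix is always symmetric, `np.linalg.eigh` is a natural alternative to `np.linalg.eig` here; it is specialized for symmetric matrices, guarantees real-valued output, and returns eigenvalues sorted in ascending order, so the last column of its eigenvector matrix would be the first principal component.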
This principal component is the direction along which the data has the highest variance. Projecting data onto this direction gives the most informative one-dimensional representation of the original dataset.
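As a quick illustration, continuing with the `X_centered` and `principal_component` variables from the snippet above, this projection is a single matrix-vector product:

# Project each centered sample onto the first principal component
projection = X_centered @ principal_component  # shape: (n_samples,)
print("1-D representation of the data:", projection)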