Dimensionality Reduction with PCA

PCA Intuition

Note
Definition

Principal component analysis (PCA) is a powerful technique that identifies new axes, called principal components, which are the directions in your data that capture the most variance.

PCA keeps the directions where your data varies the most, as these capture the key patterns and structure.

Think of PCA like shining a flashlight on a 3D object and examining the shadow on a wall. The angle of the light changes the shadow's detail. PCA finds the best angle so the shadow, or projection, reveals the most about the object's shape. Similarly, PCA projects your data onto new axes to preserve as much variation as possible.
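To make the flashlight analogy concrete, here is a minimal sketch (assuming a toy 2D NumPy dataset with the same covariance as the example below) that sweeps the projection direction through a range of angles and measures the variance of each one-dimensional "shadow"; the angle with the widest spread points along the first principal component.

import numpy as np

# Toy 2D dataset (same covariance as the example below)
np.random.seed(0)
X = np.random.multivariate_normal([0, 0], [[3, 2], [2, 2]], 200)
X = X - X.mean(axis=0)  # center the data

# Sweep "flashlight" angles and record each shadow's variance
angles = np.linspace(0, np.pi, 180, endpoint=False)
variances = [(X @ np.array([np.cos(a), np.sin(a)])).var() for a in angles]

# The widest shadow lies along the first principal component
best = angles[np.argmax(variances)]
print(f"Max-variance direction: {np.degrees(best):.1f} degrees")

Scanning angles by brute force is only for intuition; PCA finds this direction directly through the eigenvectors of the covariance matrix, as the next snippet shows.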

import numpy as np
import matplotlib.pyplot as plt

# Generate a simple 2D dataset
np.random.seed(0)
mean = [0, 0]
cov = [[3, 2], [2, 2]]  # Covariance matrix
X = np.random.multivariate_normal(mean, cov, 200)

# Compute the mean of the data
mean_vector = np.mean(X, axis=0)

# Compute the covariance matrix and its eigenvectors
cov_matrix = np.cov(X.T)
eigenvalues, eigenvectors = np.linalg.eig(cov_matrix)

# First principal component (direction of maximum variance)
pc1 = eigenvectors[:, np.argmax(eigenvalues)]

# Plot the data
plt.figure(figsize=(8, 6))
plt.scatter(X[:, 0], X[:, 1], alpha=0.3, label="Data points")
plt.quiver(
    mean_vector[0], mean_vector[1], pc1[0], pc1[1],
    angles='xy', scale_units='xy', scale=1.5,
    color='red', width=0.01, label="First principal component"
)
plt.xlabel("Feature 1")
plt.ylabel("Feature 2")
plt.title("Direction of Maximum Variance (First Principal Component)")
plt.legend()
plt.axis("equal")
plt.show()
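As a quick sanity check, the following sketch (reusing the same toy dataset) projects the centered data onto pc1; the variance of the resulting one-dimensional values matches the largest eigenvalue, which is exactly why that eigenvector is the direction of maximum variance.

import numpy as np

# Same toy dataset as above
np.random.seed(0)
X = np.random.multivariate_normal([0, 0], [[3, 2], [2, 2]], 200)
X_centered = X - X.mean(axis=0)

# Eigendecomposition of the (symmetric) covariance matrix
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(X_centered.T))
pc1 = eigenvectors[:, np.argmax(eigenvalues)]

# Project each 2D point onto the 1D line spanned by pc1
projections = X_centered @ pc1

# Variance of the projections equals the largest eigenvalue
print(projections.var(ddof=1))  # matches eigenvalues.max()
print(eigenvalues.max())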

By identifying the directions where your data varies the most, PCA allows you to reduce dimensions while preserving the most important information. Focusing on these directions of maximum variance ensures that the structure and patterns in your dataset remain clear. This understanding prepares you to explore the mathematical foundation of PCA in upcoming sections.
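In practice you would rarely hand-roll the eigendecomposition. As an illustration (assuming scikit-learn is installed; the lesson itself only uses NumPy and Matplotlib), the sketch below performs the actual reduction from two dimensions to one and reports how much variance the single component preserves.

import numpy as np
from sklearn.decomposition import PCA

# Same toy dataset as above
np.random.seed(0)
X = np.random.multivariate_normal([0, 0], [[3, 2], [2, 2]], 200)

# Keep only the single direction of maximum variance
pca = PCA(n_components=1)
X_reduced = pca.fit_transform(X)  # shape (200, 1): the data's "shadow"

# Fraction of the total variance the one component preserves
print(pca.explained_variance_ratio_)  # roughly 0.9 for this covariance

Here a single component retains roughly 90% of the total variance, which is the sense in which PCA reduces dimensions while "preserving the most important information."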


What is the main intuition behind principal components in PCA?



