Reducing Dimensions by Maximizing Variance | Mathematical Foundations of PCA
Principal Component Analysis in Python

Reducing Dimensions by Maximizing Variance


PCA ranks principal components by the variance they capture, which is measured by the eigenvalues of the covariance matrix. Each component is orthogonal to the earlier ones and captures no more variance than the one before it, so keeping the top k components preserves as much variance as any k orthogonal directions can. This reduces the number of dimensions while retaining the most informative directions in your data.
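To make the ranking and projection concrete, here is a minimal sketch. The dataset values and the choice of k = 2 are assumptions made purely for illustration; the covariance matrix is divided by the number of samples, matching the convention used elsewhere in this lesson.

```python
import numpy as np

# Small illustrative dataset (hypothetical values): 5 samples, 3 features
X = np.array([
    [2.5, 2.4, 0.5],
    [0.5, 0.7, 1.9],
    [2.2, 2.9, 0.4],
    [1.9, 2.2, 0.8],
    [3.1, 3.0, 0.1],
])

X_centered = X - X.mean(axis=0)
cov_matrix = (X_centered.T @ X_centered) / X_centered.shape[0]

# eigh is meant for symmetric matrices and returns eigenvalues in ascending order
eigenvalues, eigenvectors = np.linalg.eigh(cov_matrix)

# Reorder so the component with the largest variance comes first
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

k = 2  # number of components to keep (chosen arbitrarily for this sketch)
top_components = eigenvectors[:, :k]

# Project the centered data onto the top-k directions
X_reduced = X_centered @ top_components

print("Eigenvalues (descending):", eigenvalues)
print("Reduced shape:", X_reduced.shape)
```

The variance of each column of `X_reduced` equals the corresponding eigenvalue, which is why sorting by eigenvalue is the same as sorting by captured variance.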

The explained variance ratio for each principal component is:

$$\text{Explained Variance Ratio} = \frac{\lambda_i}{\sum_j \lambda_j}$$

where $\lambda_i$ is the $i$-th largest eigenvalue. This ratio shows how much of the total variance in your data is captured by each principal component. The sum of all explained variance ratios is always 1, since all eigenvalues together account for the total variance in the dataset.

import numpy as np

# Recompute the eigenvalues as in the previous code
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9]])
X_centered = X - np.mean(X, axis=0)
cov_matrix = (X_centered.T @ X_centered) / X_centered.shape[0]
values, vectors = np.linalg.eig(cov_matrix)

# np.linalg.eig does not guarantee any ordering, so sort from largest to smallest
order = np.argsort(values)[::-1]
values, vectors = values[order], vectors[:, order]

explained_variance_ratio = values / np.sum(values)
print("Explained variance ratio:", explained_variance_ratio)
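One way to see why the ratios sum to 1: the eigenvalues of a covariance matrix add up to its trace, and the trace is the sum of the per-feature variances. The following sketch checks this on the same three-point dataset; using `np.linalg.eigvalsh` here is a choice made for this example, not something the lesson prescribes.

```python
import numpy as np

X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9]])
X_centered = X - X.mean(axis=0)
cov_matrix = (X_centered.T @ X_centered) / X_centered.shape[0]

# Eigenvalues only, computed with the routine for symmetric matrices
eigenvalues = np.linalg.eigvalsh(cov_matrix)

# The trace is the sum of the diagonal, i.e. the sum of per-feature variances
total_variance = np.trace(cov_matrix)
ratios = eigenvalues / eigenvalues.sum()

print("Sum of eigenvalues:", eigenvalues.sum())
print("Trace of covariance:", total_variance)
print("Sum of ratios:", ratios.sum())
```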

Selecting the top principal components so that their explained variance ratios add up to a chosen threshold (such as 95%) lets you reduce the number of dimensions while keeping most of the data's information. You keep only the directions in which the data spreads the most, which are usually the most informative for analysis or modeling, and discard directions that contribute little variance. This trade-off between dimensionality and retained information is a key advantage of PCA.
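The threshold rule can be sketched with a cumulative sum. The eigenvalues below are hypothetical and assumed to be sorted from largest to smallest; the 95% threshold is the example value from the text.

```python
import numpy as np

# Hypothetical eigenvalues, already sorted from largest to smallest
eigenvalues = np.array([4.0, 2.5, 1.0, 0.3, 0.2])

explained_variance_ratio = eigenvalues / eigenvalues.sum()
cumulative = np.cumsum(explained_variance_ratio)

threshold = 0.95

# Position of the first cumulative value that reaches the threshold,
# plus one to turn a zero-based index into a component count
k = int(np.searchsorted(cumulative, threshold)) + 1

# Guard against rounding error pushing k past the number of components
k = min(k, len(eigenvalues))

print("Cumulative explained variance:", cumulative)
print("Components needed:", k)
```

With these values the first three components capture about 93.75% of the variance, so a fourth is needed to reach 95%.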


What does the explained variance ratio represent in principal component analysis (PCA)?
