Reducing Data to 2D/3D and Visualizing with Matplotlib
Section: Principal Component Analysis Fundamentals


Plotting data on its first two or three principal components helps you spot patterns and clusters that are hidden in high-dimensional space. Projecting the data onto these components exposes groupings that reveal the dataset's structure. This is especially useful for datasets like Iris, which has four features per sample: reducing to 2D or 3D makes it easier to distinguish between classes and understand the data visually.
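To make "projecting onto components" concrete, here is a minimal sketch, assuming scikit-learn's `PCA` with its default settings (no whitening). It checks that the transformed coordinates are the centered data multiplied by the component directions. The variable name `X_manual` is illustrative only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Standardize the four Iris features so each has zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(load_iris().data)

pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

# Projection by hand: center the data, then take dot products with the
# component directions (the rows of pca.components_).
X_manual = (X_scaled - pca.mean_) @ pca.components_.T

print(X_pca.shape)                   # (150, 2)
print(np.allclose(X_pca, X_manual))  # True
```

Each column of `X_pca` is therefore one coordinate axis of the plot: the position of every sample along one principal component.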

```python
import matplotlib.pyplot as plt
import seaborn as sns
# Only needed to register the 3D projection on older Matplotlib versions
from mpl_toolkits.mplot3d import Axes3D
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Load and scale the data
data = load_iris()
X = data.data
X_scaled = StandardScaler().fit_transform(X)
# Species name for each sample, so the legend shows names instead of 0/1/2
species = data.target_names[data.target]

# 2D scatter plot of the first two principal components
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

plt.figure(figsize=(8, 6))
sns.scatterplot(x=X_pca[:, 0], y=X_pca[:, 1], hue=species, palette='Set1', s=60)
plt.xlabel('Principal Component 1')
plt.ylabel('Principal Component 2')
plt.title('PCA - Iris Dataset (2D)')
plt.legend(title='Species')
plt.show()

# 3D visualization of the first three principal components
pca_3d = PCA(n_components=3)
X_pca_3d = pca_3d.fit_transform(X_scaled)

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(111, projection='3d')
# Color points by class label (0, 1, 2)
scatter = ax.scatter(X_pca_3d[:, 0], X_pca_3d[:, 1], X_pca_3d[:, 2],
                     c=data.target, cmap='Set1', s=60)
ax.set_xlabel('PC1')
ax.set_ylabel('PC2')
ax.set_zlabel('PC3')
ax.set_title('PCA - Iris Dataset (3D)')
plt.show()
```

The 2D scatter plot shows how samples are distributed along the first two principal components, often revealing clusters that correspond to different classes. The 3D plot can separate the classes further if the third component explains a meaningful share of the variance. Visualizing the data this way shows how well PCA captures the essential structure of your dataset and whether further dimensionality reduction might be appropriate for your analysis.
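One way to judge whether the third component adds much is to look at the share of variance each component explains. The sketch below assumes the same standardized Iris data as the lesson's example and uses scikit-learn's `explained_variance_ratio_` attribute. The names `ratios` and `cumulative` are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Standardize the features, as in the plotting example.
X_scaled = StandardScaler().fit_transform(load_iris().data)

# Fit a 3-component PCA and read off the variance explained by each component.
pca_3d = PCA(n_components=3).fit(X_scaled)
ratios = pca_3d.explained_variance_ratio_
cumulative = ratios.cumsum()

for i, (r, c) in enumerate(zip(ratios, cumulative), start=1):
    print(f"PC{i}: {r:.1%} of variance, {c:.1%} cumulative")
```

If the third ratio turns out small relative to the first two, the 3D plot is unlikely to show much structure that the 2D plot does not already reveal.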


What does it typically indicate if samples from different classes form distinct clusters in a 2D or 3D PCA plot of a dataset?


Section 1. Chapter 11
