Dimensionality Reduction with PCA

Reducing Data to 2D/3D and Visualizing with Matplotlib

Visualizing data with the first two or three principal components helps you spot patterns and clusters that are hidden in high-dimensional space. By projecting data onto these components, you can see groupings that reveal the dataset's structure. This is especially useful for datasets like Iris, where reducing to 2D or 3D makes it easier to distinguish between classes and understand the data visually.

# 2D scatter plot of the first two principal components
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.decomposition import PCA
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

# Load and scale the data
data = load_iris()
X = data.data
X_scaled = StandardScaler().fit_transform(X)

# Fit PCA and transform to 2D
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

plt.figure(figsize=(8, 6))
sns.scatterplot(x=X_pca[:, 0], y=X_pca[:, 1], hue=data.target, palette='Set1', s=60)
plt.xlabel('Principal Component 1')
plt.ylabel('Principal Component 2')
plt.title('PCA - Iris Dataset (2D)')
plt.legend(title='Species')
plt.show()

# 3D visualization
from mpl_toolkits.mplot3d import Axes3D  # enables the '3d' projection in older Matplotlib versions

pca_3d = PCA(n_components=3)
X_pca_3d = pca_3d.fit_transform(X_scaled)

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(111, projection='3d')
scatter = ax.scatter(X_pca_3d[:, 0], X_pca_3d[:, 1], X_pca_3d[:, 2], c=data.target, cmap='Set1', s=60)
ax.set_xlabel('PC1')
ax.set_ylabel('PC2')
ax.set_zlabel('PC3')
plt.title('PCA - Iris Dataset (3D)')
plt.show()

The 2D scatter plot shows how samples are distributed along the first two principal components, often revealing clusters corresponding to different classes. The 3D plot can provide even more separation if the third component adds significant variance. By visualizing the data in this way, you gain insights into how well PCA is capturing the essential structure of your dataset and whether further dimensionality reduction might be appropriate for your analysis.
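To judge whether the third component "adds significant variance," you can inspect the `explained_variance_ratio_` attribute of a fitted `PCA` object. The following is a minimal sketch using the same standardized Iris setup as the code above:

```python
# Sketch: how much variance each principal component captures
from sklearn.decomposition import PCA
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

# Standardize the Iris features, as in the plotting code above
X_scaled = StandardScaler().fit_transform(load_iris().data)

# Fit PCA with three components and inspect the variance ratios
pca = PCA(n_components=3).fit(X_scaled)

for i, ratio in enumerate(pca.explained_variance_ratio_, start=1):
    print(f"PC{i}: {ratio:.1%} of variance")
print(f"Cumulative (3 components): {pca.explained_variance_ratio_.sum():.1%}")
```

If the first two ratios already account for most of the variance, the 2D plot is likely a faithful summary and the third axis adds little beyond visual flair.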


Section 3. Chapter 3

