Visualizing Decorrelated Features | Whitening and Decorrelation

When you plot a dataset with correlated features, the data points typically form an elongated, tilted cloud in the scatter plot. The direction and stretch of that cloud reflect the linear relationships between the variables. Whitening is a linear transformation that changes both the shape and the orientation of this cloud: it rotates and rescales the data so that the features become uncorrelated and each has unit variance. The cloud then looks roughly spherical, with no preferred direction. This matters for algorithms that are sensitive to feature correlation or scale, because after whitening no feature is linearly redundant with another. Keep in mind that uncorrelated does not mean statistically independent: whitening removes linear relationships only.


import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler

# Create a correlated 2D dataset
np.random.seed(0)
mean = [0, 0]
cov = [[3, 2.5], [2.5, 3]]  # strong positive correlation
X = np.random.multivariate_normal(mean, cov, 500)

# Standardize the data
scaler = StandardScaler()
X_std = scaler.fit_transform(X)

# Whitening transformation using eigenvalue decomposition
cov_matrix = np.cov(X_std, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov_matrix)
whitening_matrix = eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T
X_white = X_std @ whitening_matrix

# Plot before and after whitening
fig, axes = plt.subplots(1, 2, figsize=(12, 5))
axes[0].scatter(X_std[:, 0], X_std[:, 1], alpha=0.5)
axes[0].set_title("Before Whitening")
axes[0].set_xlabel("Feature 1")
axes[0].set_ylabel("Feature 2")
axes[0].axis("equal")

axes[1].scatter(X_white[:, 0], X_white[:, 1], alpha=0.5, color="green")
axes[1].set_title("After Whitening")
axes[1].set_xlabel("Feature 1")
axes[1].set_ylabel("Feature 2")
axes[1].axis("equal")

plt.tight_layout()
plt.show()
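
To see why the whitening matrix in the code above yields uncorrelated, unit-variance features, here is a short derivation. The symbols are assumptions introduced for this sketch, not notation from the lesson: $\Sigma$ stands for `cov_matrix`, $E$ for `eigvecs`, $\Lambda$ for the diagonal matrix of `eigvals`, and $W$ for `whitening_matrix`. It also assumes all eigenvalues are strictly positive.

```latex
\begin{aligned}
\Sigma &= E \Lambda E^{\top}, \qquad E^{\top} E = I, \\
W &= E \Lambda^{-1/2} E^{\top}, \\
\operatorname{Cov}(XW) &= W^{\top} \Sigma W \\
  &= E \Lambda^{-1/2} E^{\top} \, E \Lambda E^{\top} \, E \Lambda^{-1/2} E^{\top} \\
  &= E \Lambda^{-1/2} \Lambda \Lambda^{-1/2} E^{\top} \\
  &= E E^{\top} = I .
\end{aligned}
```

Because $W$ is symmetric, the whitened data is expressed in the original feature axes rather than in the eigenvector basis, which is why the two scatter plots remain directly comparable.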

Definition

Sphering is another term for whitening. It refers to transforming a (centered) dataset so that its covariance matrix becomes the identity matrix: the features are uncorrelated with one another and each has unit variance.
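
The identity-covariance property in this definition can be checked numerically. The following is a minimal sketch, not the lesson's own code: the helper name `whiten` is hypothetical, and it centers the data itself instead of using `StandardScaler`.

```python
import numpy as np

def whiten(X):
    """Hypothetical helper: whiten X using an eigendecomposition of its covariance."""
    Xc = X - X.mean(axis=0)                   # center each feature
    cov = np.cov(Xc, rowvar=False)            # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigh: for symmetric matrices
    # Assumes all eigenvalues are strictly positive (no degenerate direction)
    W = eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T
    return Xc @ W

# Same kind of correlated 2D data as in the lesson's example
rng = np.random.default_rng(0)
X = rng.multivariate_normal([0, 0], [[3, 2.5], [2.5, 3]], size=500)

X_white = whiten(X)
cov_before = np.cov(X, rowvar=False)
cov_after = np.cov(X_white, rowvar=False)

print("Covariance before whitening:\n", np.round(cov_before, 2))
print("Covariance after whitening:\n", np.round(cov_after, 2))
```

The second matrix should match the 2x2 identity up to floating-point error, while the first should show large off-diagonal entries that reflect the original correlation.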


