Visualizing and Interpreting Coefficient Shrinkage
Understanding coefficient shrinkage is crucial when working with regularized regression models such as Ridge, Lasso, and ElasticNet. In these models, regularization penalizes large coefficients, forcing the model to keep them small or even eliminate some entirely. This process is known as coefficient shrinkage. Shrinkage helps prevent overfitting and encourages simpler, more interpretable models. In particular, Lasso (L1 regularization) can drive some coefficients exactly to zero, effectively performing feature selection by removing less important features. Ridge (L2 regularization) shrinks all coefficients towards zero but rarely makes them exactly zero, while ElasticNet combines both penalties, offering a balance between the two effects. Interpreting how these coefficients change as the regularization strength increases can help you understand which features the model considers most useful and how robust your model is to irrelevant or redundant features.
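For reference, these are roughly the objectives scikit-learn minimizes for the three estimators (a sketch based on the library documentation; note that Ridge does not scale the least-squares term by the sample size, while the coordinate-descent models do, so check the docs for the exact form):

$$
\begin{aligned}
\text{Ridge:} \quad & \min_w \; \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_2^2 \\
\text{Lasso:} \quad & \min_w \; \tfrac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1 \\
\text{ElasticNet:} \quad & \min_w \; \tfrac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \rho \lVert w \rVert_1 + \tfrac{\alpha (1 - \rho)}{2} \lVert w \rVert_2^2
\end{aligned}
$$

Here $n$ is the number of samples and $\rho$ corresponds to the `l1_ratio` parameter (0.5 in the code below). The L1 term is what can push coefficients exactly to zero; the L2 term only shrinks them.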
```python
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, Lasso, ElasticNet

# Generate synthetic regression data
X, y, coef_true = make_regression(
    n_samples=100,
    n_features=10,
    n_informative=5,
    coef=True,
    noise=10,
    random_state=42
)

alphas = np.logspace(-2, 2, 50)

coefs_ridge = []
coefs_lasso = []
coefs_enet = []

for alpha in alphas:
    ridge = Ridge(alpha=alpha, fit_intercept=False, random_state=42)
    lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=10000, random_state=42)
    enet = ElasticNet(alpha=alpha, l1_ratio=0.5, fit_intercept=False,
                      max_iter=10000, random_state=42)
    ridge.fit(X, y)
    lasso.fit(X, y)
    enet.fit(X, y)
    coefs_ridge.append(ridge.coef_)
    coefs_lasso.append(lasso.coef_)
    coefs_enet.append(enet.coef_)

plt.figure(figsize=(18, 5))

# Ridge paths
plt.subplot(1, 3, 1)
sns.set_palette("tab10")
for i in range(X.shape[1]):
    plt.plot(alphas, [coef[i] for coef in coefs_ridge],
             label=f'Feature {i}' if i < 5 else None)
plt.xscale('log')
plt.title('Ridge Coefficient Paths')
plt.xlabel('Alpha (Regularization Strength)')
plt.ylabel('Coefficient Value')
plt.axhline(0, color='grey', linestyle='--', linewidth=1)
plt.legend(loc="upper right", ncol=2, fontsize=8, frameon=False)

# Lasso paths
plt.subplot(1, 3, 2)
for i in range(X.shape[1]):
    plt.plot(alphas, [coef[i] for coef in coefs_lasso],
             label=f'Feature {i}' if i < 5 else None)
plt.xscale('log')
plt.title('Lasso Coefficient Paths')
plt.xlabel('Alpha (Regularization Strength)')
plt.axhline(0, color='grey', linestyle='--', linewidth=1)

# ElasticNet paths
plt.subplot(1, 3, 3)
for i in range(X.shape[1]):
    plt.plot(alphas, [coef[i] for coef in coefs_enet],
             label=f'Feature {i}' if i < 5 else None)
plt.xscale('log')
plt.title('ElasticNet Coefficient Paths')
plt.xlabel('Alpha (Regularization Strength)')
plt.axhline(0, color='grey', linestyle='--', linewidth=1)

plt.tight_layout()
plt.show()
```
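As a side note, scikit-learn also offers `lasso_path` (and `enet_path`) to compute a whole coefficient path in one call using warm starts, which is usually faster than refitting a fresh estimator for every alpha as the loop above does. Below is a minimal sketch for the Lasso panel only, using the same synthetic data; keep in mind that the path functions return alphas in descending order and do not fit an intercept.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.linear_model import lasso_path

# Same synthetic data as in the main script
X, y, _ = make_regression(
    n_samples=100, n_features=10, n_informative=5,
    coef=True, noise=10, random_state=42
)

# Compute the whole Lasso path in one call (alphas are re-sorted descending internally)
alphas = np.logspace(-2, 2, 50)
alphas_out, coefs, _ = lasso_path(X, y, alphas=alphas)

# coefs has shape (n_features, n_alphas): one row per feature, one column per alpha
for i in range(coefs.shape[0]):
    plt.plot(alphas_out, coefs[i], label=f'Feature {i}' if i < 5 else None)
plt.xscale('log')
plt.xlabel('Alpha (Regularization Strength)')
plt.ylabel('Coefficient Value')
plt.title('Lasso Coefficient Paths via lasso_path')
plt.legend(loc='upper right', ncol=2, fontsize=8, frameon=False)
plt.show()
```

The resulting curves should match the Lasso panel of the three-panel figure produced by the main script.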
The three coefficient-path panels produced by the main script show how each model's coefficients respond as the regularization strength (alpha) increases. With Ridge, coefficients for all features are gradually shrunk towards zero, but none are eliminated entirely. In contrast, Lasso drives some coefficients exactly to zero as alpha increases, effectively removing those features from the model, so only the subset of features with the strongest signal is retained. ElasticNet combines both effects: it shrinks coefficients and can set some to zero, but typically retains more features than Lasso alone. By examining which features remain nonzero at higher regularization strengths, you can identify the most important predictors in your data and gain insight into the model's decision process. This interpretability is especially valuable when you need to justify feature choices or understand the impact of regularization on your model.
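To turn that observation into code, you can inspect which Lasso coefficients survive at a given alpha and compare them with the ground-truth coefficients that `make_regression` returns. The sketch below uses the same data as above; the alpha value is an arbitrary illustrative choice, not a tuned one (in practice you would pick it by cross-validation, e.g. with `LassoCV`).

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# Same synthetic setup as above: 10 features, 5 of which are informative
X, y, coef_true = make_regression(
    n_samples=100, n_features=10, n_informative=5,
    coef=True, noise=10, random_state=42
)

alpha = 10.0  # illustrative value only; tune via cross-validation in practice
lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=10000).fit(X, y)
ridge = Ridge(alpha=alpha, fit_intercept=False).fit(X, y)

# Lasso sets weak coefficients exactly to zero; Ridge only shrinks them
print("Truly informative features:", np.flatnonzero(coef_true))
print("Features kept by Lasso:    ", np.flatnonzero(lasso.coef_))
print("Nonzero Ridge coefficients:", np.count_nonzero(ridge.coef_), "of", X.shape[1])
```

On this dataset the Lasso selection will typically line up well with the informative features, though with noisier data or strongly correlated features the match is not guaranteed.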