
Visualizing and Interpreting Coefficient Shrinkage

Understanding coefficient shrinkage is crucial when working with regularized regression models such as Ridge, Lasso, and ElasticNet. In these models, regularization penalizes large coefficients, forcing the model to keep them small or even eliminate some entirely. This process is known as coefficient shrinkage. Shrinkage helps prevent overfitting and encourages simpler, more interpretable models. In particular, Lasso (L1 regularization) can drive some coefficients exactly to zero, effectively performing feature selection by removing less important features. Ridge (L2 regularization) shrinks all coefficients towards zero but rarely makes them exactly zero, while ElasticNet combines both penalties, offering a balance between the two effects. Interpreting how these coefficients change as the regularization strength increases can help you understand which features the model considers most useful and how robust your model is to irrelevant or redundant features.
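Concretely, all three models minimize a squared-error loss plus a penalty on the coefficient vector $w$. The objectives below follow scikit-learn's documented conventions, where $\alpha$ sets the overall penalty strength and $\rho$ (the l1_ratio parameter) mixes the two penalty types:

$$\text{Ridge:}\quad \min_w \; \|y - Xw\|_2^2 + \alpha \|w\|_2^2$$

$$\text{Lasso:}\quad \min_w \; \frac{1}{2n} \|y - Xw\|_2^2 + \alpha \|w\|_1$$

$$\text{ElasticNet:}\quad \min_w \; \frac{1}{2n} \|y - Xw\|_2^2 + \alpha \rho \|w\|_1 + \frac{\alpha (1 - \rho)}{2} \|w\|_2^2$$

The L1 term's constant slope near zero is what lets Lasso and ElasticNet snap small coefficients exactly to zero, while the quadratic L2 term only ever shrinks them smoothly.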

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, Lasso, ElasticNet

# Generate synthetic regression data: 10 features, only 5 informative
X, y, coef_true = make_regression(
    n_samples=100, n_features=10, n_informative=5,
    coef=True, noise=10, random_state=42
)

# Sweep the regularization strength over a log-spaced grid
alphas = np.logspace(-2, 2, 50)
coefs_ridge = []
coefs_lasso = []
coefs_enet = []

for alpha in alphas:
    ridge = Ridge(alpha=alpha, fit_intercept=False, random_state=42)
    lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=10000,
                  random_state=42)
    enet = ElasticNet(alpha=alpha, l1_ratio=0.5, fit_intercept=False,
                      max_iter=10000, random_state=42)
    ridge.fit(X, y)
    lasso.fit(X, y)
    enet.fit(X, y)
    coefs_ridge.append(ridge.coef_)
    coefs_lasso.append(lasso.coef_)
    coefs_enet.append(enet.coef_)

plt.figure(figsize=(18, 5))
sns.set_palette("tab10")

# Ridge paths
plt.subplot(1, 3, 1)
for i in range(X.shape[1]):
    plt.plot(alphas, [coef[i] for coef in coefs_ridge],
             label=f'Feature {i}' if i < 5 else None)
plt.xscale('log')
plt.title('Ridge Coefficient Paths')
plt.xlabel('Alpha (Regularization Strength)')
plt.ylabel('Coefficient Value')
plt.axhline(0, color='grey', linestyle='--', linewidth=1)
plt.legend(loc="upper right", ncol=2, fontsize=8, frameon=False)

# Lasso paths
plt.subplot(1, 3, 2)
for i in range(X.shape[1]):
    plt.plot(alphas, [coef[i] for coef in coefs_lasso],
             label=f'Feature {i}' if i < 5 else None)
plt.xscale('log')
plt.title('Lasso Coefficient Paths')
plt.xlabel('Alpha (Regularization Strength)')
plt.axhline(0, color='grey', linestyle='--', linewidth=1)

# ElasticNet paths
plt.subplot(1, 3, 3)
for i in range(X.shape[1]):
    plt.plot(alphas, [coef[i] for coef in coefs_enet],
             label=f'Feature {i}' if i < 5 else None)
plt.xscale('log')
plt.title('ElasticNet Coefficient Paths')
plt.xlabel('Alpha (Regularization Strength)')
plt.axhline(0, color='grey', linestyle='--', linewidth=1)

plt.tight_layout()
plt.show()
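A few settings are worth noting: fit_intercept=False keeps all of the shrinkage acting on the feature weights themselves, and max_iter is raised for Lasso and ElasticNet because their coordinate-descent solvers can need many iterations to converge at small alphas. The random_state arguments only affect solvers with a stochastic component, so with these defaults they serve purely as a reproducibility safeguard.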

The visualization above shows how each model's coefficients respond as the regularization strength (alpha) increases. With Ridge, coefficients for all features are gradually shrunk towards zero, but none are eliminated entirely. In contrast, Lasso drives some coefficients exactly to zero as alpha increases, effectively removing those features from the model. This means Lasso selects only a subset of features, retaining the ones with the strongest signal. ElasticNet combines both effects: it shrinks coefficients and can set some to zero, but typically retains more features than Lasso alone. By examining which features remain nonzero at higher regularization strengths, you can identify the most important predictors in your data and gain insight into the model's decision process. This interpretability is especially valuable when you need to justify feature choices or understand the impact of regularization on your model.
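To turn the plots into an explicit feature-selection step, you can refit Lasso at a single regularization strength and list which coefficients survive. The sketch below regenerates the same synthetic data as the example above; the choice of alpha=1.0 is an arbitrary illustration, not a tuned value:

import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Same synthetic data as in the example above
X, y, coef_true = make_regression(
    n_samples=100, n_features=10, n_informative=5,
    coef=True, noise=10, random_state=42
)

# Refit Lasso at one illustrative alpha (not tuned)
lasso = Lasso(alpha=1.0, fit_intercept=False, max_iter=10000)
lasso.fit(X, y)

# Features whose coefficients survived the L1 penalty
selected = np.flatnonzero(lasso.coef_).tolist()
dropped = np.flatnonzero(lasso.coef_ == 0).tolist()
print(f"Selected features: {selected}")
print(f"Dropped features:  {dropped}")

In practice you would choose alpha by cross-validation (for example with scikit-learn's LassoCV) rather than by eye.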


Which statements about coefficient shrinkage in Ridge, Lasso, and ElasticNet regression are correct as regularization strength increases?

