Learn Estimator Introspection and Parameter Management | Introspection, Reproducibility, and Anti-Patterns

When working with scikit-learn estimators, you often need to inspect or modify their configuration. The get_params and set_params methods exist for this purpose. Every scikit-learn estimator provides them, including transformers and pipelines, so you have one consistent way to read or update the constructor parameters that control an estimator's behavior. get_params returns a dictionary of the current parameter values; by default (deep=True) it also includes the parameters of nested estimators, keyed as <component>__<parameter>, for example clf__C for a pipeline step named clf. set_params updates one or more parameters on an existing estimator using the same names, so you can adjust a configuration without rebuilding the object. This is especially useful when you are experimenting with different settings or automating workflows.


import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Create a pipeline as in earlier examples
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", LogisticRegression(C=1.0, max_iter=100))
])

# Retrieve all parameters, including nested estimator parameters
params = pipe.get_params()
print("Original pipeline parameters:")
print({k: params[k] for k in list(params)[:5]})  # Display a sample

# Update the regularization strength and scaler's with_mean parameter
pipe.set_params(clf__C=0.5, scaler__with_mean=False)

# Check that the parameters were updated
updated_params = pipe.get_params()
print("\nUpdated pipeline parameters:")
print(f"clf__C: {updated_params['clf__C']}")
print(f"scaler__with_mean: {updated_params['scaler__with_mean']}")
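
Two details of these methods are easy to miss: the deep argument of get_params controls whether nested parameters are included, and set_params rejects names it does not recognize. The sketch below illustrates both on the same pipeline; the name clf__not_a_param is deliberately invalid and exists only for this illustration.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", LogisticRegression(C=1.0, max_iter=100)),
])

# deep=False returns only the pipeline's own constructor arguments (e.g. "steps")
shallow_keys = set(pipe.get_params(deep=False))

# deep=True (the default) adds the step objects and their nested parameters
deep_keys = set(pipe.get_params(deep=True))
print(sorted(k for k in deep_keys - shallow_keys if k.startswith("clf__"))[:3])

# set_params returns the estimator itself, so calls can be chained
same_obj = pipe.set_params(clf__C=0.5)

# An unknown parameter name raises ValueError instead of being silently ignored
try:
    pipe.set_params(clf__not_a_param=1)  # hypothetical, intentionally invalid name
    rejected = False
except ValueError as exc:
    rejected = True
    print(f"Rejected: {exc}")
```

Because invalid names fail loudly, a typo in a parameter name surfaces immediately rather than leaving the estimator at its previous setting.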

Managing parameters through get_params and set_params supports reproducible machine learning workflows. Because an estimator's configuration is available as a plain dictionary, you can record the exact settings used in each experiment, so results can be replicated later or shared with collaborators. The same dictionaries can be logged, versioned, or passed to automated tracking tools. Recording parameters this way helps you catch silent configuration drift and keeps model development transparent.
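
As one possible way to record a configuration, the sketch below writes the scalar parameters of a pipeline to a JSON string and applies them to a freshly built pipeline. The helpers make_pipeline and loggable_params are hypothetical names introduced here, not part of scikit-learn, and the approach assumes every setting you care about is a plain scalar; parameters holding objects (such as the steps list) are skipped and must be rebuilt in code.

```python
import json

from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def make_pipeline():
    """Hypothetical factory: rebuilds the pipeline structure with default settings."""
    return Pipeline([
        ("scaler", StandardScaler()),
        ("clf", LogisticRegression()),
    ])


def loggable_params(estimator):
    """Hypothetical helper: keep only parameters whose values are JSON scalars."""
    scalar_types = (str, int, float, bool, type(None))
    return {
        name: value
        for name, value in estimator.get_params(deep=True).items()
        if isinstance(value, scalar_types)
    }


# Configure an experiment, then serialize its settings for a log or version control
pipe = make_pipeline().set_params(clf__C=0.5, scaler__with_mean=False)
record = json.dumps(loggable_params(pipe), sort_keys=True, indent=2)

# Later: rebuild the structure and reapply the recorded settings
restored = make_pipeline().set_params(**json.loads(record))
print(restored.get_params()["clf__C"], restored.get_params()["scaler__with_mean"])
```

This records configuration only. Fitted state, data versions, and library versions are outside what get_params returns and would need to be tracked separately.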

Section 5. Chapter 1
