Building Pipelines
To streamline machine learning workflows, scikit-learn provides the Pipeline object. A Pipeline chains together a sequence of transformers and a final estimator, allowing you to treat the entire sequence as a single estimator. This means you can combine preprocessing steps (such as scaling or encoding) with your model, making your code more organized, less error-prone, and easier to maintain. By encapsulating multiple steps, a pipeline ensures that transformations are applied consistently during both training and prediction, reducing the risk of data leakage and simplifying cross-validation or grid search procedures.
```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris

# Load example data
X, y = load_iris(return_X_y=True)

# Construct a pipeline with scaling and logistic regression
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("classifier", LogisticRegression())
])

# Fit the pipeline to the data
pipeline.fit(X, y)

# Predict using the pipeline
predictions = pipeline.predict(X)
print(predictions[:5])
```
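Because the pipeline behaves as a single estimator, it can be passed directly to cross-validation or grid-search utilities. A sketch of this, assuming the same iris pipeline as above: hyperparameters of individual steps are addressed with the `<step_name>__<parameter>` naming convention, so `classifier__C` below targets the `C` parameter of the `LogisticRegression` step (the grid values are illustrative, not tuned).

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("classifier", LogisticRegression())
])

# Step parameters are addressed as "<step_name>__<parameter>"
param_grid = {"classifier__C": [0.1, 1.0, 10.0]}

# The scaler is re-fit on each training fold, so no test-fold
# statistics leak into preprocessing
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)
```

Because the whole pipeline is refit inside each fold, the scaler only ever sees the training portion of the data, which is exactly the leakage protection described above.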
When you use a pipeline, each step is executed in the order you defined. In the example above, the data first passes through the StandardScaler, which standardizes features by removing the mean and scaling to unit variance. The output of the scaler is then passed directly to the LogisticRegression classifier. By calling fit on the pipeline, both the scaler and the classifier are trained sequentially: fit_transform is called on the scaler, and then fit is called on the classifier using the transformed data. Similarly, when you call predict, the input is automatically transformed by the scaler before being passed to the classifier for prediction. This ordered execution ensures that your preprocessing and modeling steps are always applied in a consistent and reproducible way.
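The fit/predict dispatch described above can be made concrete by comparing the pipeline against the equivalent manual sequence of calls; this is a sketch using the same iris example, and the assertion checks that both routes produce identical predictions.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Pipeline version
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("classifier", LogisticRegression())
])
pipeline.fit(X, y)

# Manual version: the same sequence of calls the pipeline performs
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)            # fit_transform on the transformer
clf = LogisticRegression().fit(X_scaled, y)   # fit on the final estimator

# predict transforms first, then classifies; both routes agree
manual_preds = clf.predict(scaler.transform(X))
assert np.array_equal(pipeline.predict(X), manual_preds)
```

Writing out the manual version also shows what the pipeline saves you from: remembering to call `transform` (not `fit_transform`) on new data at prediction time.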