ML Introduction with scikit-learn
Pipeline
Now that you know how to transform columns separately using the `make_column_transformer` function, you are well-equipped to create pipelines!
As a reminder, a pipeline is a container for your preprocessing steps that applies them sequentially.
![](https://codefinity-content-media.s3.eu-west-1.amazonaws.com/a65bbc96-309e-4df9-a790-a1eb8c815a1c/PipelineImage.png)
To create a pipeline in scikit-learn, you can use either the `Pipeline` class constructor or the `make_pipeline` function, both from the `sklearn.pipeline` module.
In this course, we will focus on the second approach since it is easier to use.
![](https://codefinity-content-media.s3.eu-west-1.amazonaws.com/a65bbc96-309e-4df9-a790-a1eb8c815a1c/MakePipelineFunc.png)
You just need to pass all the transformers as arguments to the function, in the order they should be applied. Creating pipelines is that simple.
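As a minimal sketch of both approaches, the snippet below builds the same two-step pipeline with `make_pipeline` and with the `Pipeline` constructor. The toy array and the choice of `SimpleImputer` and `StandardScaler` are assumptions made for illustration; they are not taken from the course dataset.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import StandardScaler

# Toy numeric data with one missing value (made up for illustration)
X = np.array([[1.0, 10.0], [2.0, np.nan], [3.0, 30.0]])

# make_pipeline names each step automatically after its class
pipe = make_pipeline(SimpleImputer(strategy="mean"), StandardScaler())

# The Pipeline constructor expects explicit (name, transformer) pairs
pipe_explicit = Pipeline([
    ("simpleimputer", SimpleImputer(strategy="mean")),
    ("standardscaler", StandardScaler()),
])

# Both pipelines impute first, then scale the imputed data
X_a = pipe.fit_transform(X)
X_b = pipe_explicit.fit_transform(X)

# Step names generated by make_pipeline
step_names = [name for name, _ in pipe.steps]
print(step_names)
```

The two objects behave the same way here; `make_pipeline` simply saves you from naming the steps yourself.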
However, when you call the `.fit_transform(X)` method on the `Pipeline` object, it calls `.fit_transform()` on each transformer in turn, passing the whole output of one step to the next. Every step therefore sees all the columns, so if you want to treat some columns differently, you should wrap those transformers in a `ColumnTransformer` and pass it to `make_pipeline()`.
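The following sketch illustrates the difference under some assumptions: the DataFrame and its column names (`height`, `count`) are hypothetical. The `StandardScaler` inside the `ColumnTransformer` touches only `height`, while the `SimpleImputer` placed directly in the pipeline runs on every column it receives.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical data: 'height' should be scaled, 'count' should not
df = pd.DataFrame({
    "height": [150.0, 160.0, 170.0],
    "count": [1.0, np.nan, 3.0],
})

# Scale only 'height'; pass the remaining columns through unchanged
ct = make_column_transformer(
    (StandardScaler(), ["height"]),
    remainder="passthrough",
)

# The imputer is a plain pipeline step, so it sees all columns
pipe = make_pipeline(ct, SimpleImputer(strategy="mean"))
result = pipe.fit_transform(df)

# Column 0 is the scaled 'height', column 1 is the imputed 'count'
print(result)
```

Note that the output columns follow the order of the transformers in the `ColumnTransformer`, with the passthrough columns placed last.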
![](https://codefinity-content-media.s3.eu-west-1.amazonaws.com/a65bbc96-309e-4df9-a790-a1eb8c815a1c/applies_to_all.gif)
Let's code! We will use the same file as in the previous chapter.
We want to build a pipeline containing encoders for the categorical features and a `SimpleImputer`. The categorical features include both nominal and ordinal ones, so we need a `ColumnTransformer` to encode them separately, as we did in the previous chapter.
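A pipeline of this shape might look like the sketch below. The course file is not reproduced here, so the DataFrame, its column names (`size`, `color`, `weight`), the category order, and the `most_frequent` imputation strategy are all assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Hypothetical stand-in for the course dataset
df = pd.DataFrame({
    "size": ["small", "large", "medium", "small"],  # ordinal feature
    "color": ["red", "blue", "red", "blue"],        # nominal feature
    "weight": [1.0, np.nan, 3.0, 1.0],              # numeric, one value missing
})

# Encode the ordinal and nominal columns separately
ct = make_column_transformer(
    (OrdinalEncoder(categories=[["small", "medium", "large"]]), ["size"]),
    (OneHotEncoder(), ["color"]),
    remainder="passthrough",  # keep 'weight' as is
)

# Encoders first, then the imputer on the fully numeric result
pipe = make_pipeline(ct, SimpleImputer(strategy="most_frequent"))
X = pipe.fit_transform(df)

# Columns: encoded size, color_blue, color_red, imputed weight
print(X)
```

Placing the imputer after the `ColumnTransformer` means it receives an all-numeric array, which keeps the example simple; other step orders are possible depending on where the missing values are.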