ML Introduction with scikit-learn
Creating a Pipeline Challenge
In this challenge, you need to combine all the preprocessing steps we did into one pipeline. The dataset is the initial penguins.csv file we started from.
The first step is to remove two useless rows.
Then you will have to create a pipeline containing encoding, imputing, and scaling. You only need to encode two columns, 'sex' and 'island'. Since you do not want to encode the whole X, you must use ColumnTransformer. Then apply the SimpleImputer and StandardScaler to the entire X.
You will use the make_column_transformer() and make_pipeline() functions for this.
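As a refresher, the sketch below shows how the two helpers fit together on a tiny made-up DataFrame. The column names color and size are invented for illustration and are not part of the penguins data. make_column_transformer() takes (transformer, columns) pairs, and make_pipeline() chains its steps in the order they are passed.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data (NOT the penguins dataset).
df = pd.DataFrame({
    "color": ["red", "blue", "red", "green"],
    "size": [1.0, 2.0, 3.0, 4.0],
})

# Each positional argument is a (transformer, columns) pair.
# remainder="passthrough" keeps the unlisted columns as they are;
# the default, remainder="drop", would discard them.
ct = make_column_transformer(
    (OneHotEncoder(), ["color"]),
    remainder="passthrough",
)

# Steps run in the order given: first the column transformer, then the scaler.
pipe = make_pipeline(ct, StandardScaler())

result = pipe.fit_transform(df)
print(result.shape)
```

The output has one column per category of color plus the untouched size column, all of them scaled by the final step.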
- Import the correct function for making a pipeline.
- Make a ColumnTransformer with the OneHotEncoder applied only to columns 'sex' and 'island'.
- Set a remainder argument of make_column_transformer so that all the numerical columns remain untouched.
- Make a pipeline containing ct you just created, SimpleImputer with the strategy of 'most_frequent', and a StandardScaler as a last step.
- Transform the X using the pipe you created.
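The steps above could look roughly like the following. This is one possible sketch, not the course's official solution. X here is a small hand-made stand-in for the penguins features: the values are invented, only two numerical columns are included, missing values appear only in the numerical columns, and the two useless rows are assumed to be removed already.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical stand-in for X; the real challenge loads penguins.csv.
X = pd.DataFrame({
    "island": ["Biscoe", "Dream", "Torgersen", "Dream", "Biscoe", "Dream"],
    "sex": ["MALE", "FEMALE", "FEMALE", "MALE", "FEMALE", "MALE"],
    "culmen_length_mm": [39.1, 46.5, np.nan, 50.0, 39.1, 45.2],
    "body_mass_g": [3750.0, np.nan, 3250.0, 5700.0, 3800.0, 3750.0],
})

# Encode only 'sex' and 'island'; pass the numerical columns through untouched.
ct = make_column_transformer(
    (OneHotEncoder(), ["sex", "island"]),
    remainder="passthrough",
)

# Encoding first, then imputing and scaling applied to every column.
pipe = make_pipeline(
    ct,
    SimpleImputer(strategy="most_frequent"),
    StandardScaler(),
)

X_transformed = pipe.fit_transform(X)
print(X_transformed.shape)
```

With remainder left at its default of 'drop', the numerical columns would be discarded before imputing and scaling, which is why 'passthrough' is set here.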