ML Introduction with scikit-learn
Putting It All Together
In this challenge, you will apply everything you learned throughout the course. Here are the steps you need to take:
- Remove the rows that hold too little information;
- Encode the target y;
- Split the dataset into training and test sets;
- Build a pipeline with all the preprocessing steps and the GridSearchCV as the final estimator to find the best hyperparameters;
- Train the model using the pipeline;
- Evaluate the model using the pipeline;
- Predict the target for X_new and decode it using the LabelEncoder's .inverse_transform().
Let's get to w̵o̵r̵k̵ code!
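As a warm-up, here is a minimal sketch of the first two steps and the final decoding step. The toy DataFrame, the 'species' target column, the 'body_mass_g' feature, and the threshold of 3 non-missing values are all assumptions made for illustration; only the 'island' and 'sex' column names come from the task, and the course dataset itself is not reproduced here.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data standing in for the course dataset.
df = pd.DataFrame({
    "island": ["Biscoe", "Dream", None, "Dream", "Torgersen"],
    "sex": ["MALE", "FEMALE", None, "FEMALE", "MALE"],
    "body_mass_g": [5000.0, 3500.0, np.nan, 3600.0, 3700.0],
    "species": ["Gentoo", "Chinstrap", "Adelie", "Chinstrap", "Adelie"],
})

# Remove rows that hold too little information: keep only rows with
# at least 3 non-missing values (the threshold is an assumption).
df = df.dropna(thresh=3)

# Separate the features from the target.
X = df.drop(columns="species")
y = df["species"]

# Encode the string labels as integers (classes are sorted alphabetically).
label_enc = LabelEncoder()
y_encoded = label_enc.fit_transform(y)

# Decoding maps the integers back to the original labels. The same call
# would be applied to a model's predictions for X_new.
decoded = label_enc.inverse_transform(y_encoded)

print(label_enc.classes_)
print(y_encoded)
print(decoded)
```

Keeping the fitted LabelEncoder around matters: it is the only object that remembers which integer corresponds to which class name.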





Task
- Encode the target using LabelEncoder.
- Split the data so that 33% is used for a test set and the rest for a training set.
- Make a ColumnTransformer to encode only the 'island' and 'sex' columns. Make the others remain untouched. Use a proper encoder for nominal data.
- Fill the gaps in a param_grid to try the following values for the number of neighbors: [1, 3, 5, 7, 9, 12, 15, 20, 25].
- Create a GridSearchCV object with the KNeighborsClassifier as a model.
- Make a pipeline with ct as a first step and grid_search as a final estimator.
- Train the model using a pipeline on the training set.
- Evaluate the model on the test set. (Print its score)
- Get a predicted target for X_test.
- Print the best estimator found by grid_search.
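The task steps above can be sketched as follows. This is one possible arrangement, not the course's reference solution: the synthetic data, the 'measurement' and 'target' column names, the class labels, and the random_state values are assumptions chosen so the example runs on its own. OneHotEncoder is used here as one reasonable encoder for nominal data.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Synthetic stand-in for the course dataset (hypothetical values).
rng = np.random.default_rng(0)
n = 150
target = rng.choice(["A", "B", "C"], size=n)
offset = pd.Series(target).map({"A": 0.0, "B": 5.0, "C": 10.0}).to_numpy()
df = pd.DataFrame({
    "island": rng.choice(["Biscoe", "Dream", "Torgersen"], size=n),
    "sex": rng.choice(["MALE", "FEMALE"], size=n),
    "measurement": offset + rng.normal(size=n),  # numeric, tied to the class
    "target": target,
})
X = df.drop(columns="target")
y = df["target"]

# Encode the target using LabelEncoder.
label_enc = LabelEncoder()
y = label_enc.fit_transform(y)

# Use 33% of the data as the test set and the rest as the training set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)

# One-hot encode only 'island' and 'sex'; pass the other columns through.
ct = ColumnTransformer(
    [("onehot", OneHotEncoder(handle_unknown="ignore"), ["island", "sex"])],
    remainder="passthrough",
)

# Candidate values for the number of neighbors.
param_grid = {"n_neighbors": [1, 3, 5, 7, 9, 12, 15, 20, 25]}
grid_search = GridSearchCV(KNeighborsClassifier(), param_grid)

# ct is the first step; grid_search is the final estimator.
pipe = Pipeline([("ct", ct), ("grid_search", grid_search)])
pipe.fit(X_train, y_train)

# Evaluate on the test set (mean accuracy of the refitted best model).
test_score = pipe.score(X_test, y_test)
print(test_score)

# Predict for X_test and decode the integers back to class labels.
y_pred = pipe.predict(X_test)
print(label_enc.inverse_transform(y_pred))

# Best estimator found by the grid search.
print(grid_search.best_estimator_)
```

With this layout the ColumnTransformer is fitted once on the whole training set, and cross-validation runs only over the KNeighborsClassifier. Wrapping the full pipeline inside GridSearchCV instead would refit the preprocessing in every fold, which is a different design from the one the task describes.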
Section 4. Chapter 10