Challenge: Putting It All Together
In this challenge, you will apply everything you learned throughout the course, from data preprocessing to training and evaluating the model.





Task
- Encode the target.
- Split the data so that 33% is used for the test set and the remainder for the training set.
- Make a ColumnTransformer to encode only the 'island' and 'sex' columns. Make sure the other columns remain untouched. Use a proper encoder for nominal data.
- Fill the gaps in param_grid to try the following values for the number of neighbors: [1, 3, 5, 7, 9, 12, 15, 20, 25].
- Create a GridSearchCV object with KNeighborsClassifier as the model.
- Construct a pipeline that begins with ct as the first step, followed by imputation using the most frequent value, then standardization, and concludes with GridSearchCV as the final estimator.
- Train the model using the pipeline on the training set.
- Evaluate the model on the test set and print its score.
- Get the predicted target for X_test.
- Print the best estimator found by grid_search.
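The steps in the task list can be sketched roughly as follows. This is a minimal illustration, not the course's official solution. The course dataset is not reproduced here, so the sketch builds a small synthetic DataFrame. The 'species' target, the numeric columns 'bill_length_mm' and 'body_mass_g', the category values, and random_state=0 are all assumptions made for the example. LabelEncoder and OneHotEncoder are one reasonable choice for the target and the nominal columns. Only 'island', 'sex', ct, param_grid, grid_search, and X_test come from the task itself.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler

# ASSUMPTION: synthetic stand-in data; swap in the course DataFrame here.
rng = np.random.default_rng(0)
n = 150
species = rng.choice(["Adelie", "Chinstrap", "Gentoo"], size=n)
offset = pd.Series(species).map(
    {"Adelie": 0.0, "Chinstrap": 5.0, "Gentoo": 10.0}
).to_numpy()
df = pd.DataFrame({
    "species": species,  # hypothetical target column
    "island": rng.choice(["Biscoe", "Dream", "Torgersen"], size=n),
    "sex": rng.choice(["MALE", "FEMALE"], size=n),
    "bill_length_mm": 40.0 + offset + rng.normal(0, 1.5, size=n),
    "body_mass_g": 3500.0 + 100.0 * offset + rng.normal(0, 150, size=n),
})
# Knock out a few numeric values so the imputation step has something to fill.
df.loc[[3, 17, 42], "bill_length_mm"] = np.nan

# Encode the target as integer class labels.
X = df.drop(columns="species")
y = LabelEncoder().fit_transform(df["species"])

# Hold out 33% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0
)

# One-hot encode only 'island' and 'sex'; pass the other columns through.
# sparse_threshold=0 forces a dense output so StandardScaler can center it.
ct = ColumnTransformer(
    [("ohe", OneHotEncoder(), ["island", "sex"])],
    remainder="passthrough",
    sparse_threshold=0,
)

# Candidate values for the number of neighbors.
param_grid = {"n_neighbors": [1, 3, 5, 7, 9, 12, 15, 20, 25]}
grid_search = GridSearchCV(KNeighborsClassifier(), param_grid)

# Encoding -> most-frequent imputation -> standardization -> grid search.
pipe = make_pipeline(
    ct,
    SimpleImputer(strategy="most_frequent"),
    StandardScaler(),
    grid_search,
)
pipe.fit(X_train, y_train)

# Accuracy of the best model on the held-out test set.
score = pipe.score(X_test, y_test)
print(score)

# Predicted (encoded) target for the test set.
y_pred = pipe.predict(X_test)
print(y_pred)

# The pipeline fits grid_search in place, so its results are available here.
print(grid_search.best_estimator_)
```

Placing GridSearchCV last means the preprocessing steps are fitted once on the whole training set and the search runs only over the classifier. An alternative, not required by the task, is to wrap the entire pipeline in GridSearchCV so that preprocessing is refitted inside each cross-validation fold.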