Putting It All Together | Modeling
ML Introduction with scikit-learn

Putting It All Together

In this challenge, you will apply everything you learned throughout the course. Here are the steps you need to take:

  1. Remove the rows that hold too little information;
  2. Encode the target (y);
  3. Split the dataset into training and test sets;
  4. Build a pipeline with all the preprocessing steps and GridSearchCV as the final estimator to find the best hyperparameters;
  5. Train the model using the pipeline;
  6. Evaluate the model using the pipeline;
  7. Predict the target for X_new and decode it using the LabelEncoder's .inverse_transform() (steps 1, 2, and 7 are sketched right after this list).
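A minimal sketch of steps 1, 2, and 7, assuming the data lives in a pandas DataFrame named df with a 'species' target column and that a trained pipeline pipe and new samples X_new exist later on; the names df, 'species', pipe, X_new and the dropna threshold are illustrative assumptions, not part of the exercise:

```python
from sklearn.preprocessing import LabelEncoder

# Assumption: df is a pandas DataFrame with a 'species' target column.

# Step 1: remove rows that hold too little information,
# e.g. keep only rows with at least 4 non-missing values (threshold is illustrative).
df = df.dropna(thresh=4)

# Step 2: encode the string target labels as integers.
label_enc = LabelEncoder()
y = label_enc.fit_transform(df['species'])
X = df.drop(columns='species')

# Step 7: once a pipeline (pipe) is trained, predict new samples X_new
# and decode the integer predictions back into species names.
# y_pred_new = pipe.predict(X_new)
# print(label_enc.inverse_transform(y_pred_new))
```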

Let's get to w̵o̵r̵k̵ code!


Task

  1. Encode the target using LabelEncoder.
  2. Split the data so that 33% is used for the test set and the rest for the training set.
  3. Create a ColumnTransformer that encodes only the 'island' and 'sex' columns and leaves the others untouched. Use a proper encoder for nominal data.
  4. Fill the gaps in param_grid to try the following values for the number of neighbors: [1, 3, 5, 7, 9, 12, 15, 20, 25].
  5. Create a GridSearchCV object with the KNeighborsClassifier as the model.
  6. Make a pipeline with ct as the first step and grid_search as the final estimator.
  7. Train the model using the pipeline on the training set.
  8. Evaluate the model on the test set (print its score).
  9. Get the predicted target for X_test.
  10. Print the best estimator found by grid_search (a hedged sketch covering all ten steps follows this list).
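Here is one way the ten steps above might fit together: a minimal sketch, assuming the data has already been cleaned (missing values handled) and lives in a pandas DataFrame df with a 'species' target column; df, 'species', and the random_state value are assumptions for illustration, not part of the task.

```python
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Assumption: df is the cleaned penguins DataFrame with a 'species' target column.
X = df.drop(columns='species')

# 1. Encode the target.
label_enc = LabelEncoder()
y = label_enc.fit_transform(df['species'])

# 2. 33% of the data goes to the test set.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)

# 3. One-hot encode the nominal columns, pass the rest through unchanged.
ct = ColumnTransformer(
    [('onehot', OneHotEncoder(), ['island', 'sex'])],
    remainder='passthrough'
)

# 4. Values of n_neighbors to try.
param_grid = {'n_neighbors': [1, 3, 5, 7, 9, 12, 15, 20, 25]}

# 5. Grid search over a KNeighborsClassifier.
grid_search = GridSearchCV(KNeighborsClassifier(), param_grid)

# 6. Preprocessing first, grid search as the final estimator.
pipe = make_pipeline(ct, grid_search)

# 7. Train on the training set.
pipe.fit(X_train, y_train)

# 8. Evaluate on the test set.
print(pipe.score(X_test, y_test))

# 9. Predict the test set targets.
y_pred = pipe.predict(X_test)

# 10. Best estimator found by the grid search.
print(grid_search.best_estimator_)
```

Because grid_search is the final step of the pipeline, pipe.fit first applies the encoding and then searches over n_neighbors with cross-validation, and pipe.score reports the test accuracy of the best model found. The predictions in y_pred are integers; label_enc.inverse_transform(y_pred) turns them back into species names.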


Section 4. Chapter 10