ML Introduction with scikit-learn

Challenge: Tuning Hyperparameters with RandomizedSearchCV

The principle of RandomizedSearchCV is similar to GridSearchCV, but instead of testing every possible combination, it evaluates only a randomly sampled subset.

For instance, the following param_grid defines 100 combinations (10 × 2 × 5):

param_grid = {
    'n_neighbors': [1, 3, 5, 7, 9, 12, 15, 17, 20, 25],
    'weights': ['distance', 'uniform'],
    'p': [1, 2, 3, 4, 5]
}
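The combination count can be verified with a quick Cartesian-product calculation (a stand-alone sketch; the grid is copied from above):

```python
from itertools import product

param_grid = {
    'n_neighbors': [1, 3, 5, 7, 9, 12, 15, 17, 20, 25],
    'weights': ['distance', 'uniform'],
    'p': [1, 2, 3, 4, 5],
}

# Each combination is one element of the Cartesian product: 10 * 2 * 5 = 100
combinations = list(product(*param_grid.values()))
print(len(combinations))  # 100
```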

GridSearchCV would test all 100, which is time-consuming. RandomizedSearchCV can instead evaluate a smaller subset, e.g., 20 randomly chosen combinations. This reduces computation time and usually produces results close to the best.

The number of combinations to test is controlled by the n_iter argument (default is 10). Otherwise, usage mirrors GridSearchCV, except that the parameter grid is passed as param_distributions instead of param_grid.
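One way to see n_iter in action is to inspect cv_results_ after fitting: it contains exactly n_iter candidates. The small grid and the Iris dataset below are stand-ins for illustration, not part of the challenge:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# A small toy grid: 5 * 2 = 10 possible combinations
param_grid = {
    'n_neighbors': [1, 3, 5, 7, 9],
    'weights': ['distance', 'uniform'],
}

# n_iter=5 -> only 5 of the 10 combinations are evaluated;
# random_state makes the random draw reproducible
search = RandomizedSearchCV(
    KNeighborsClassifier(),
    param_distributions=param_grid,
    n_iter=5,
    random_state=0,
)
search.fit(X, y)

# cv_results_ holds one entry per evaluated candidate
print(len(search.cv_results_['params']))  # 5
```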

Task


You are given a preprocessed penguin dataset ready for model training. Your goal is to tune the hyperparameters of a KNeighborsClassifier model using both grid search and randomized search methods.

  1. Define the parameter grid named param_grid with the desired values for n_neighbors, weights, and p.
  2. Initialize a RandomizedSearchCV object using the defined parameter grid and set n_iter=20.
  3. Initialize a GridSearchCV object using the same parameter grid.
  4. Train both search objects on the dataset using the .fit(X, y) method.
  5. Print the best estimator from the grid search using .best_estimator_.
  6. Print the best cross-validation score from the randomized search using .best_score_.
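The steps above can be sketched as follows. Since the preprocessed penguin dataset is not included here, the Iris dataset stands in for X and y; the rest follows the task step by step:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Stand-in for the preprocessed penguin data
X, y = load_iris(return_X_y=True)

# Step 1: parameter grid (10 * 2 * 5 = 100 combinations)
param_grid = {
    'n_neighbors': [1, 3, 5, 7, 9, 12, 15, 17, 20, 25],
    'weights': ['distance', 'uniform'],
    'p': [1, 2, 3, 4, 5],
}

# Step 2: randomized search over 20 sampled combinations
randomized = RandomizedSearchCV(
    KNeighborsClassifier(),
    param_distributions=param_grid,
    n_iter=20,
    random_state=1,
)

# Step 3: grid search over all 100 combinations
grid = GridSearchCV(KNeighborsClassifier(), param_grid=param_grid)

# Step 4: fit both search objects
randomized.fit(X, y)
grid.fit(X, y)

# Steps 5-6: best estimator from the grid search,
# best cross-validation score from the randomized search
print(grid.best_estimator_)
print(randomized.best_score_)
```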

Solution

Note

Try running the code several times and compare the two scores. They can be identical when the best parameter combination happens to be among those sampled by RandomizedSearchCV.


Section 4. Chapter 8
