Learn Challenge: Choosing the Best K Value. | k-NN Classifier
Classification with Python

Challenge: Choosing the Best K Value.

As shown in the previous chapters, the model makes different predictions for different values of k (the number of neighbors).
When we build a model, we want to choose the k that leads to the best performance. In the previous chapter, we learned how to measure performance using cross-validation.
Running a loop that calculates cross-validation scores for a range of k values and picking the k with the highest score sounds like a no-brainer, and it is the most frequently used approach. sklearn has a neat class for that task: GridSearchCV.
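
For comparison, the manual loop described above could be sketched roughly as follows. The iris dataset, the range of k values, and the 5 folds are assumptions made for illustration and are not taken from the course.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Assumption: the iris dataset stands in for the course data.
X, y = load_iris(return_X_y=True)

scores = {}
for k in range(1, 30):
    knn = KNeighborsClassifier(n_neighbors=k)
    # Mean accuracy over 5 cross-validation folds for this k
    scores[k] = cross_val_score(knn, X, y, cv=5).mean()

# Pick the k with the highest mean cross-validation score
best_k = max(scores, key=scores.get)
print(best_k, scores[best_k])
```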

The param_grid parameter takes a dictionary with parameter names as keys and lists of values to try as values. For example, to try values 1-99 for n_neighbors, the dictionary maps the key 'n_neighbors' to the integers from 1 to 99.
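
A minimal sketch of such a dictionary is shown below. Building the values with list(range(...)) is one possible choice made here for illustration; any list of candidate values would work the same way.

```python
# Candidate values 1, 2, ..., 99 for the n_neighbors parameter
param_grid = {'n_neighbors': list(range(1, 100))}

print(param_grid['n_neighbors'][:5])
```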

Calling .fit(X, y) makes the GridSearchCV object find the best parameters from param_grid and then re-train the model with those parameters on the whole dataset.
You can then get the highest cross-validation score from the .best_score_ attribute and predict new values using the .predict() method.
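
Put together, the workflow might look like the sketch below. The iris dataset and the 5 folds are assumptions for illustration. The .best_params_ attribute, not mentioned above, is a GridSearchCV attribute that reports which parameter values won.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Assumption: the iris dataset stands in for the course data.
X, y = load_iris(return_X_y=True)

param_grid = {'n_neighbors': list(range(1, 100))}

# Cross-validate every candidate k, then refit the best one on all of X, y
grid = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
grid.fit(X, y)

print(grid.best_params_)  # the winning n_neighbors value
print(grid.best_score_)   # mean cross-validation score for that value

# Predictions come from the model refitted with the best parameters
predictions = grid.predict(X[:5])
print(predictions)
```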

Task

  1. Import the GridSearchCV class.
  2. Scale X using StandardScaler.
  3. Look for the best value of n_neighbors among [3, 9, 18, 27].
  4. Initialize and train a GridSearchCV object with 4 folds of cross-validation.
  5. Print the score of the best model.

Solution
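
One way to approach the five steps, as a sketch only: the course dataset is not shown here, so the breast cancer dataset bundled with sklearn is used as a stand-in, and the variable names are illustrative.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV  # step 1
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Assumption: a bundled dataset stands in for the course data.
X, y = load_breast_cancer(return_X_y=True)

# Step 2: scale the features
X_scaled = StandardScaler().fit_transform(X)

# Step 3: candidate values for n_neighbors
param_grid = {'n_neighbors': [3, 9, 18, 27]}

# Step 4: grid search with 4 folds of cross-validation
grid = GridSearchCV(KNeighborsClassifier(), param_grid, cv=4)
grid.fit(X_scaled, y)

# Step 5: score of the best model
print(grid.best_score_)
```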

Section 1. Chapter 7
