Learn Challenge: Choosing the Best K Value | k-NN Classifier
Classification with Python
Challenge: Choosing the Best K Value.

As shown in the previous chapters, the model makes different predictions for different values of k (the number of neighbors).
When we build a model, we want to choose the k that leads to the best performance. In the previous chapter, we learned how to measure performance using cross-validation.
Running a loop that calculates the cross-validation score for each k in some range and picking the k with the highest score sounds like a no-brainer, and it is the most frequently used approach. sklearn has a neat class for this task: GridSearchCV.

The param_grid parameter takes a dictionary with parameter names as keys and the lists of values to try as values. For example, to try the values 1 through 99 for n_neighbors, you would pass {'n_neighbors': range(1, 100)}.
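To see what such a grid expands to, here is a minimal sketch. It uses sklearn's ParameterGrid helper, which the chapter itself does not mention; it is used here only to list the candidate settings that a grid search would go through.

```python
from sklearn.model_selection import ParameterGrid

# Grid described in the text: try n_neighbors = 1, 2, ..., 99
param_grid = {'n_neighbors': range(1, 100)}

# ParameterGrid (an illustration aid, not part of the chapter) enumerates
# every parameter combination as a dictionary
candidates = list(ParameterGrid(param_grid))

print(len(candidates))   # 99
print(candidates[0])     # {'n_neighbors': 1}
print(candidates[-1])    # {'n_neighbors': 99}
```

Each of these dictionaries is one candidate model that gets its own cross-validation score.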


Calling .fit(X, y) makes the GridSearchCV object find the best parameters from param_grid using cross-validation and then re-train the model with those parameters on the whole dataset.
You can then get the best cross-validation score from the .best_score_ attribute and predict new values with the .predict() method.
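The workflow described above can be sketched as follows. The synthetic dataset from make_classification and the variable names are assumptions made for illustration, since the course data is not shown here; cross-validation uses sklearn's default of 5 folds.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (an assumption; not the course dataset)
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# Try every n_neighbors value from 1 to 99
param_grid = {'n_neighbors': range(1, 100)}

# With no cv argument, GridSearchCV uses 5-fold cross-validation
grid_search = GridSearchCV(KNeighborsClassifier(), param_grid)
grid_search.fit(X, y)  # searches the grid, then refits on all of X, y

print(grid_search.best_params_)  # the k with the highest mean CV score
print(grid_search.best_score_)   # that highest mean CV score

# The fitted object predicts with the refitted best model
predictions = grid_search.predict(X[:5])
print(predictions)
```

The .best_params_ attribute used here complements .best_score_: it reports which k produced the best score.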

Task


  1. Import the GridSearchCV class.
  2. Scale X using StandardScaler.
  3. Look for the best value of n_neighbors among [3, 9, 18, 27].
  4. Initialize and train a GridSearchCV object with 4 folds of cross-validation.
  5. Print the score of the best model.

Solution
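One possible way to work through the task steps is sketched below. The dataset is a synthetic stand-in generated with make_classification (an assumption, since the exercise's own data is not included here), and the variable names are illustrative.

```python
# Step 1: import GridSearchCV (plus the other pieces the task needs)
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption; not the exercise dataset)
X, y = make_classification(n_samples=300, n_features=5, random_state=42)

# Step 2: scale X so every feature has zero mean and unit variance
X_scaled = StandardScaler().fit_transform(X)

# Step 3: candidate values for n_neighbors
param_grid = {'n_neighbors': [3, 9, 18, 27]}

# Step 4: grid search with 4 folds of cross-validation
grid_search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=4)
grid_search.fit(X_scaled, y)

# Step 5: print the cross-validation score of the best model
print(grid_search.best_score_)
```

Scaling all of X before the search follows the task as stated. Wrapping the scaler and classifier in a Pipeline would instead fit the scaler on the training folds only.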

Section 1. Chapter 7