Notice: This page requires JavaScript to function properly.
Please enable JavaScript in your browser settings or update your browser.
Lære Challenge: Choosing the Best K Value. | k-NN Classifier
Classification with Python
course content

Kursinnhold

Classification with Python

Classification with Python

1. k-NN Classifier
2. Logistic Regression
3. Decision Tree
4. Random Forest
5. Comparing Models

book
Challenge: Choosing the Best K Value.

As shown in the previous chapters, the model makes different predictions for different k(neighbors number) values.
When we build a model, we want to choose the k that will lead to the best performance. And in the previous chapter, we learned how to measure performance using cross-validation.
Running a loop and calculating cross-validation scores for some range of k values to choose the highest sounds like a no-brainer. And that's the most frequently used approach. sklearn has a neat class for that task.

The param_grid parameter takes a dictionary with parameter names as keys and a list of items to go through as a list. For example, to try values 1-99 for n_neighbors, you would use:

python

The .fit(X, y) method leads the GridSearchCV object to find the best parameters from param_grid and re-train the model with the best parameters using the whole set.
You can then get the highest score using the .best_score_ attribute and predict new values using the .predict() method.

Oppgave

Swipe to start coding

  1. Import the GridSearchCV class.
  2. Scale the X using StandardScaler.
  3. Look for the best value of n_neighbors among [3, 9, 18, 27].
  4. Initialize and train a GridSearchCV object with 4 folds of cross-validation.
  5. Print the score of the best model.

Løsning

Switch to desktopBytt til skrivebordet for virkelighetspraksisFortsett der du er med et av alternativene nedenfor
Alt var klart?

Hvordan kan vi forbedre det?

Takk for tilbakemeldingene dine!

Seksjon 1. Kapittel 7
toggle bottom row

book
Challenge: Choosing the Best K Value.

As shown in the previous chapters, the model makes different predictions for different k(neighbors number) values.
When we build a model, we want to choose the k that will lead to the best performance. And in the previous chapter, we learned how to measure performance using cross-validation.
Running a loop and calculating cross-validation scores for some range of k values to choose the highest sounds like a no-brainer. And that's the most frequently used approach. sklearn has a neat class for that task.

The param_grid parameter takes a dictionary with parameter names as keys and a list of items to go through as a list. For example, to try values 1-99 for n_neighbors, you would use:

python

The .fit(X, y) method leads the GridSearchCV object to find the best parameters from param_grid and re-train the model with the best parameters using the whole set.
You can then get the highest score using the .best_score_ attribute and predict new values using the .predict() method.

Oppgave

Swipe to start coding

  1. Import the GridSearchCV class.
  2. Scale the X using StandardScaler.
  3. Look for the best value of n_neighbors among [3, 9, 18, 27].
  4. Initialize and train a GridSearchCV object with 4 folds of cross-validation.
  5. Print the score of the best model.

Løsning

Switch to desktopBytt til skrivebordet for virkelighetspraksisFortsett der du er med et av alternativene nedenfor
Alt var klart?

Hvordan kan vi forbedre det?

Takk for tilbakemeldingene dine!

Seksjon 1. Kapittel 7
Switch to desktopBytt til skrivebordet for virkelighetspraksisFortsett der du er med et av alternativene nedenfor
Vi beklager at noe gikk galt. Hva skjedde?
some-alt