Challenge: Choosing the Best K Value.
As shown in the previous chapters, the model makes different predictions for different values of k (the number of neighbors).
When we build a model, we want to choose the k that will lead to the best performance. And in the previous chapter, we learned how to measure performance using cross-validation.
Running a loop that calculates the cross-validation score for each k in some range and then picking the k with the highest score sounds like a no-brainer, and it is the most frequently used approach. sklearn has a neat class for that task: GridSearchCV.
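Before turning to that class, here is a minimal sketch of the manual loop described above. It assumes a KNeighborsClassifier and uses the Iris dataset purely for illustration, since the lesson's own data is not shown here. The range of k values and the number of folds are also arbitrary choices.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Illustrative data only (assumption): the lesson's dataset is not shown.
X, y = load_iris(return_X_y=True)
X = StandardScaler().fit_transform(X)

scores = {}
for k in range(1, 30):
    model = KNeighborsClassifier(n_neighbors=k)
    # Mean accuracy over 5 cross-validation folds for this k.
    scores[k] = cross_val_score(model, X, y, cv=5).mean()

# Pick the k with the highest mean cross-validation score.
best_k = max(scores, key=scores.get)
print(best_k, scores[best_k])
```

This works, but it leaves the bookkeeping (storing scores, picking the winner, re-training) to you.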
The param_grid parameter takes a dictionary with parameter names as keys and, as values, the list of values to try for each parameter. For example, to try values 1-99 for n_neighbors, you would use:
param_grid = {'n_neighbors': range(1, 100)}
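To see what such a dictionary expands to, one option is sklearn's ParameterGrid helper, which enumerates every combination in a grid. The sketch below uses it only for inspection. The second grid, with the weights parameter of the k-NN estimators, is a hypothetical example that is not part of the lesson.

```python
from sklearn.model_selection import ParameterGrid

# The grid from the lesson: one candidate per value of n_neighbors.
param_grid = {'n_neighbors': range(1, 100)}
candidates = list(ParameterGrid(param_grid))
print(len(candidates))  # 99
print(candidates[0])    # {'n_neighbors': 1}

# Hypothetical two-parameter grid: every combination is tried.
wider_grid = {'n_neighbors': [3, 5], 'weights': ['uniform', 'distance']}
print(len(ParameterGrid(wider_grid)))  # 4
```

With more than one key, the number of candidates is the product of the list lengths, so large grids get expensive quickly.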
The .fit(X, y) method makes the GridSearchCV object find the best parameters from param_grid and re-train the model with those parameters on the whole dataset passed to it.
You can then get the best cross-validation score using the .best_score_ attribute and predict new values using the .predict() method.
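Put together, the workflow might look like the sketch below. As assumptions for illustration, it uses a KNeighborsClassifier, the Iris dataset, 5 folds, and k values from 1 to 29. None of these come from the lesson itself.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Illustrative data only (assumption): the lesson's dataset is not shown.
X, y = load_iris(return_X_y=True)
X = StandardScaler().fit_transform(X)

param_grid = {'n_neighbors': range(1, 30)}
grid_search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)

# Cross-validates every candidate, then re-trains the best one on all of X, y.
grid_search.fit(X, y)

print(grid_search.best_params_)     # the winning n_neighbors
print(grid_search.best_score_)      # its mean cross-validation score
print(grid_search.predict(X[:3]))   # predictions from the re-trained model
```

The .best_params_ attribute is also available if you want the chosen k itself rather than only its score.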
- Import the GridSearchCV class.
- Scale X using StandardScaler.
- Look for the best value of n_neighbors among [3, 9, 18, 27].
- Initialize and train a GridSearchCV object with 4 folds of cross-validation.
- Print the score of the best model.