GridSearchCV
To improve model performance, we tune hyperparameters. The idea is simple: try several candidate values, compute a cross-validation score for each, and keep the value with the highest score.
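To make the idea concrete, here is a minimal hand-written sketch of that loop. It is only an illustration: the Iris dataset from load_iris is a stand-in chosen so the snippet runs without downloads, and the names candidates, mean_scores and best_k are hypothetical.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Stand-in dataset (an assumption; any feature matrix X and labels y would do)
X, y = load_iris(return_X_y=True)

candidates = [1, 3, 5, 7]  # values of n_neighbors to try
mean_scores = {}
for k in candidates:
    model = KNeighborsClassifier(n_neighbors=k)
    # 5-fold cross-validation; the mean accuracy summarizes this candidate
    mean_scores[k] = cross_val_score(model, X, y, cv=5).mean()

# Keep the candidate with the highest mean cross-validation score
best_k = max(mean_scores, key=mean_scores.get)
print(mean_scores)
print("best n_neighbors:", best_k)
```

Writing this loop by hand gets tedious as soon as more than one hyperparameter is involved, which is the bookkeeping a dedicated search class takes over.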
This process can be done using the GridSearchCV class of the sklearn.model_selection module.
GridSearchCV requires a model and a parameter grid (param_grid): a dictionary that maps each hyperparameter name to the list of values to try.
Example:
param_grid = {'n_neighbors': [1, 3, 5, 7]}
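When the grid has more than one key, every combination of values is tried. The sketch below enumerates those combinations with ParameterGrid from sklearn.model_selection. The extra 'weights' entry is not part of this lesson's example; it is added here only as an assumption to show how combinations multiply.

```python
from sklearn.model_selection import ParameterGrid

param_grid = {
    'n_neighbors': [1, 3, 5, 7],
    'weights': ['uniform', 'distance'],  # extra hyperparameter, added for illustration only
}

# ParameterGrid expands the dictionary into every combination of values
combinations = list(ParameterGrid(param_grid))
print(len(combinations))  # 4 values * 2 values = 8 combinations
for params in combinations:
    print(params)
```

Each combination is cross-validated separately, so the cost of a search grows with the product of the list lengths.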
After initializing GridSearchCV, call .fit(X, y).
- The best model is in .best_estimator_;
- Its cross-validation score is in .best_score_.
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import GridSearchCV

# Load the preprocessed penguins dataset
df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/a65bbc96-309e-4df9-a790-a1eb8c815a1c/penguins_pipelined.csv')
# Features are every column except the target 'species'
X, y = df.drop('species', axis=1), df['species']

# Candidate values for n_neighbors
param_grid = {'n_neighbors': [1,3,5,7,9]}
grid_search = GridSearchCV(KNeighborsClassifier(), param_grid)
grid_search.fit(X, y)

print(grid_search.best_estimator_)  # model with the best hyperparameters
print(grid_search.best_score_)      # its mean cross-validation score
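Besides .best_estimator_ and .best_score_, a fitted search also exposes .best_params_ and .cv_results_, which help when you want to compare all candidates rather than only the winner. The following is a sketch under one assumption: it uses the Iris dataset from load_iris as a stand-in so it runs offline, so the printed numbers will differ from the penguins example.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Stand-in dataset (an assumption, used so the example needs no download)
X, y = load_iris(return_X_y=True)

param_grid = {'n_neighbors': [1, 3, 5, 7]}
grid_search = GridSearchCV(KNeighborsClassifier(), param_grid)
grid_search.fit(X, y)

# Dictionary with the winning hyperparameter values
print(grid_search.best_params_)

# One row per candidate, with its mean cross-validation score and rank
results = pd.DataFrame(grid_search.cv_results_)
print(results[['param_n_neighbors', 'mean_test_score', 'rank_test_score']])
```

The row with rank_test_score equal to 1 is the candidate reported by .best_params_, and its mean_test_score matches .best_score_.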
Once the search is finished, GridSearchCV (with its default refit=True) automatically retrains the best estimator on the full dataset passed to .fit().
The grid_search object can then be used as the final trained model: its .predict() and .score() methods delegate to the refitted best estimator.
grid_search.fit(X, y)
print(grid_search.score(X, y))  # training accuracy (not reliable for real evaluation)
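For a more trustworthy estimate, one common option is to hold out a test set before running the search and score on it afterwards. The sketch below assumes the Iris dataset from load_iris as a stand-in and a 70/30 split; neither choice comes from this lesson. It also checks that grid_search.predict() gives the same output as best_estimator_.predict().

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in dataset (an assumption, used so the example needs no download)
X, y = load_iris(return_X_y=True)

# Hold out 30% of the rows; the search never sees them
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

param_grid = {'n_neighbors': [1, 3, 5, 7, 9]}
grid_search = GridSearchCV(KNeighborsClassifier(), param_grid)
grid_search.fit(X_train, y_train)  # cross-validation happens inside the training set

# Accuracy on unseen data
test_accuracy = grid_search.score(X_test, y_test)

# The search object delegates predictions to the refitted best estimator
same_predictions = bool(
    (grid_search.predict(X_test) == grid_search.best_estimator_.predict(X_test)).all()
)
print(test_accuracy)
print(same_predictions)
```

Because the held-out rows play no part in choosing n_neighbors, this score is not inflated by the selection step the way the training accuracy is.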