GridSearchCV
To improve model performance, we tune hyperparameters. The idea is simple: try different values, compute the cross-validation score for each, and choose the value with the highest score.
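This tuning loop can be sketched by hand before reaching for GridSearchCV. The sketch below uses the built-in iris dataset as a stand-in for the lesson's penguins data (an assumption for self-containedness):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Try each candidate value, cross-validate, keep the best one
best_k, best_score = None, -1.0
for k in [1, 3, 5, 7]:
    # Mean accuracy over 5 cross-validation folds (the default)
    score = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y).mean()
    if score > best_score:
        best_k, best_score = k, score

print(best_k, round(best_score, 3))
```

GridSearchCV automates exactly this loop, including the bookkeeping for the best value and its score.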
This process can be done using the GridSearchCV class of the sklearn.model_selection module.
GridSearchCV requires a model and a parameter grid (param_grid).
Example:
param_grid = {'n_neighbors': [1, 3, 5, 7]}
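A grid is not limited to one hyperparameter: GridSearchCV tries every combination of the listed values. A minimal sketch (the 'weights' values are standard KNeighborsClassifier options):

```python
# 4 values of n_neighbors * 2 values of weights = 8 candidate models
param_grid = {
    'n_neighbors': [1, 3, 5, 7],
    'weights': ['uniform', 'distance'],
}
```

Keep in mind the number of candidates grows multiplicatively with each added hyperparameter.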
After initializing GridSearchCV, call .fit(X, y).
- The best model is stored in .best_estimator_;
- Its cross-validation score is stored in .best_score_.
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import GridSearchCV

df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/a65bbc96-309e-4df9-a790-a1eb8c815a1c/penguins_pipelined.csv')
X, y = df.drop('species', axis=1), df['species']

param_grid = {'n_neighbors': [1, 3, 5, 7, 9]}
grid_search = GridSearchCV(KNeighborsClassifier(), param_grid)
grid_search.fit(X, y)
print(grid_search.best_estimator_)
print(grid_search.best_score_)
After fitting, GridSearchCV automatically retrains the best estimator on the full dataset (this is the default refit=True behavior).
The grid_search object then acts as the final trained model and can be used directly with .predict() and .score().
grid_search.fit(X, y)
print(grid_search.score(X, y))  # training accuracy (not reliable for real evaluation)
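For an honest estimate, hold out a test set that the search never sees: fit GridSearchCV on the training split and score on the test split. A minimal sketch, again using the built-in iris dataset instead of the lesson's penguins file:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

grid_search = GridSearchCV(KNeighborsClassifier(),
                           {'n_neighbors': [1, 3, 5, 7, 9]})
grid_search.fit(X_train, y_train)         # search + refit on training data only
print(grid_search.score(X_test, y_test))  # accuracy on unseen data
```

The cross-validation inside GridSearchCV guards against overfitting individual hyperparameter choices, but only a held-out test set measures how the final model generalizes.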