Learn Grid Search and Hyperparameter Tuning | Model Selection and Evaluation Utilities
Mastering scikit-learn API and Workflows

Grid Search and Hyperparameter Tuning

Grid search is a systematic approach for hyperparameter tuning in machine learning, and GridSearchCV is scikit-learn’s core utility for this task. The main purpose of GridSearchCV is to automate the process of searching over specified parameter values for an estimator, such as a classifier or regressor, to find the combination that yields the best cross-validated performance. You specify a parameter grid—essentially a dictionary mapping parameter names to lists of values to try—and GridSearchCV evaluates all possible combinations using cross-validation. This ensures a thorough and unbiased search across the hyperparameter space, reducing the risk of overfitting to a single validation set.
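As a quick sanity check on how large such a search is, scikit-learn's ParameterGrid utility can enumerate every combination a grid defines. The sketch below uses a grid mirroring the one in this lesson; the total fit count with cross-validation is simply combinations times folds:

```python
from sklearn.model_selection import ParameterGrid

# A grid over three SVC hyperparameters (as used later in this lesson)
param_grid = {
    'svc__C': [0.1, 1, 10],
    'svc__kernel': ['linear', 'rbf'],
    'svc__gamma': ['scale', 'auto']
}

# ParameterGrid expands the grid into every concrete combination
combos = list(ParameterGrid(param_grid))
print(len(combos))  # 3 * 2 * 2 = 12 combinations

# With cv=5, GridSearchCV would fit 12 * 5 = 60 models in total
print(combos[0])  # one concrete parameter setting
```

Because the cost grows multiplicatively with each parameter added, it pays to keep grids small and refine them iteratively.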

The typical workflow involves defining your estimator or pipeline, preparing the parameter grid, and then constructing a GridSearchCV object. After fitting, you can retrieve the best parameters and estimator found during the search. This approach is especially powerful when combined with pipelines, as it allows you to tune preprocessing steps and model hyperparameters simultaneously by referencing parameters using a double-underscore (__) notation.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Load dataset
X, y = load_iris(return_X_y=True)

# Define a pipeline with preprocessing and model
pipe = Pipeline([
    ('scaler', StandardScaler()),
    ('svc', SVC())
])

# Define parameter grid for grid search
param_grid = {
    'svc__C': [0.1, 1, 10],
    'svc__kernel': ['linear', 'rbf'],
    'svc__gamma': ['scale', 'auto']
}

# Set up GridSearchCV
grid = GridSearchCV(pipe, param_grid, cv=5, scoring='accuracy', n_jobs=-1)

# Fit the grid search to the data
grid.fit(X, y)

# Access the best parameters and score
print("Best parameters:", grid.best_params_)
print("Best cross-validated accuracy: {:.3f}".format(grid.best_score_))
```

GridSearchCV integrates seamlessly into the scikit-learn workflow. You can use it wherever you would use a regular estimator: fit it to your training data, predict on new samples, and score its performance. After fitting, you can access the best set of parameters found via the best_params_ attribute, and the optimized estimator itself via best_estimator_. This makes it easy to deploy the tuned model or analyze which parameter settings performed best, supporting reproducible and robust model selection in your projects.
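A minimal sketch of using the fitted search object as a drop-in estimator, here with a small illustrative grid on a bare SVC rather than the full pipeline. After fitting, predict and score delegate to best_estimator_, and the cv_results_ attribute exposes per-combination scores that load neatly into a pandas DataFrame:

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# A deliberately small grid for illustration
grid = GridSearchCV(SVC(), {'C': [0.1, 1, 10]}, cv=5)
grid.fit(X, y)

# The fitted search behaves like a regular estimator:
# predict/score delegate to the refit best_estimator_
preds = grid.predict(X[:5])
print("Best parameters:", grid.best_params_)

# cv_results_ records mean score and rank for every combination tried
results = pd.DataFrame(grid.cv_results_)
print(results[['param_C', 'mean_test_score', 'rank_test_score']])
```

Inspecting cv_results_ this way helps you see not just the winner but how sensitive performance is to each parameter, which is useful when deciding where to refine the grid.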



Section 4. Chapter 2

