Hyperparameter Tuning Basics
Manual and Search-Based Tuning Methods

Grid Search with GridSearchCV

Grid search is a systematic approach for hyperparameter tuning where you exhaustively evaluate every possible combination of specified hyperparameter values. This method ensures that you explore all options within your defined space, which can be especially useful when you want to avoid missing a potentially optimal configuration. In practice, grid search can quickly become tedious and computationally expensive if performed manually, especially as the number of hyperparameters and their candidate values increases. To address this, scikit-learn provides an automated tool called GridSearchCV that handles the exhaustive search and evaluation process efficiently.

Note
Definition

Grid search is a method that evaluates all possible combinations of specified hyperparameter values. This approach ensures that every configuration in your parameter grid is considered during model tuning.
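
To make "all possible combinations" concrete, the following sketch enumerates a small, hypothetical two-parameter grid with itertools.product; with three values for one parameter and two for the other, the grid contains 3 × 2 = 6 configurations:

from itertools import product

# Hypothetical candidate values: 3 C values x 2 kernels = 6 combinations
C_values = [0.1, 1, 10]
kernels = ["linear", "rbf"]

for C, kernel in product(C_values, kernels):
    print(f"C={C}, kernel={kernel}")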

Note
Definition

Parameter grid refers to a dictionary that specifies the hyperparameters and their candidate values for search. Each key in the dictionary is the name of a hyperparameter, and each value is a list of possible values to test during grid search.
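
As an illustrative sketch (the specific values here are arbitrary assumptions, not prescribed), a parameter grid for an SVC might look like this:

# Keys are SVC hyperparameter names; values are lists of candidates to test
param_grid = {
    "C": [0.1, 1, 10],            # regularization strength
    "kernel": ["linear", "rbf"],  # kernel type
    "gamma": ["scale", "auto"],   # kernel coefficient for "rbf"
}
# Grid search would test all 3 x 2 x 2 = 12 combinations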

Note
Definition

Cross-validation is a technique for assessing model performance by splitting data into multiple train/test sets. This approach helps you obtain a more reliable estimate of how your model will perform on unseen data.
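
As a minimal sketch, scikit-learn's cross_val_score performs this splitting and scoring in one call (the dataset and fold count below are illustrative):

from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_moons(n_samples=1000, noise=0.35, random_state=42)

# 5-fold cross-validation: train on four folds, score on the held-out fold, repeat
scores = cross_val_score(SVC(), X, y, cv=5)
print(f"Fold accuracies: {scores}")
print(f"Mean accuracy: {scores.mean():.3f}")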

To automate grid search, use scikit-learn's GridSearchCV, shown here with a support vector classifier (SVC). Start by defining a parameter grid as a dictionary: each key is a hyperparameter name, and its value is a list of candidate values to test. GridSearchCV evaluates every combination of these values using cross-validation and automatically selects the best set of hyperparameters based on a scoring metric such as accuracy. For contrast, the example below first compares a default random forest against a manually tuned one; a GridSearchCV sketch follows.

from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Generate a more challenging dataset
X, y = make_moons(n_samples=1000, noise=0.35, random_state=42)

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Default Random Forest
clf_default = RandomForestClassifier(random_state=42)
clf_default.fit(X_train, y_train)
acc_default = accuracy_score(y_test, clf_default.predict(X_test))
print(f"Default (n_estimators=100, max_depth=None): {acc_default:.3f}")

# Tuned Random Forest
clf_tuned = RandomForestClassifier(
    n_estimators=300,
    max_depth=6,
    min_samples_split=3,
    min_samples_leaf=2,
    random_state=42
)
clf_tuned.fit(X_train, y_train)
acc_tuned = accuracy_score(y_test, clf_tuned.predict(X_test))
print(f"Tuned (n_estimators=300, max_depth=6): {acc_tuned:.3f}")
print(f"Improvement: +{acc_tuned - acc_default:.3f}")
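
The sketch below shows the GridSearchCV workflow described above, applied to an SVC on the same dataset; the grid values and fold count are illustrative assumptions, not fixed choices:

from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=1000, noise=0.35, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Parameter grid: every combination of these values is evaluated
param_grid = {
    "C": [0.1, 1, 10],
    "kernel": ["linear", "rbf"],
}

# 5-fold cross-validation on the training set, scored by accuracy
grid_search = GridSearchCV(SVC(), param_grid, cv=5, scoring="accuracy")
grid_search.fit(X_train, y_train)

print(f"Best parameters: {grid_search.best_params_}")
print(f"Best cross-validation accuracy: {grid_search.best_score_:.3f}")

# By default, GridSearchCV refits the best model on the full training set,
# so the fitted object can score the held-out test set directly
print(f"Test accuracy: {grid_search.score(X_test, y_test):.3f}")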

By automating the search and evaluation process, GridSearchCV saves you from having to manually train and compare models for every parameter combination. This not only improves efficiency but also reduces the risk of human error in the tuning workflow.

