Why Tuning Matters for Performance
When you train a machine learning model, the choices you make for its hyperparameters can make a dramatic difference in how well it performs. Hyperparameters are the settings you specify before training begins—such as the learning rate, regularization strength, or kernel parameters for a support vector classifier (SVC). If you leave these at their default values, you may not get the best possible results. In fact, the right combination of hyperparameters can mean the difference between a model that is highly accurate and one that fails to generalize to new data. Tuning hyperparameters is not just about squeezing out a few extra percentage points of accuracy; it is about ensuring your model is robust and can handle data it has never seen before.
Generalization refers to a model's ability to perform well on unseen data, not just the training set. This means the model can make accurate predictions on new inputs it has not encountered before, which is the ultimate goal of machine learning.
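A quick way to see a generalization problem is to compare a model's accuracy on the training set with its accuracy on a held-out test set. The sketch below assumes an overly flexible model, an unconstrained decision tree, purely to make the gap visible:

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = make_moons(noise=0.3, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# An unconstrained tree can memorize the training set perfectly
tree = DecisionTreeClassifier(random_state=42)
tree.fit(X_train, y_train)

print(f"Train accuracy: {accuracy_score(y_train, tree.predict(X_train)):.2f}")
print(f"Test accuracy: {accuracy_score(y_test, tree.predict(X_test)):.2f}")

A large gap between the two numbers is the signature of overfitting: the model has memorized the training data instead of learning a pattern that carries over to new inputs.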
To see this in action, you can compare the performance of an SVC using its default hyperparameters to its performance after tuning two important hyperparameters: C and gamma. The C parameter controls the trade-off between fitting the training data closely and keeping the decision boundary simple enough to generalize, while gamma defines how far the influence of a single training example reaches. Choosing these values carefully can greatly improve your model's generalization.
from sklearn.datasets import make_moons
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Noisy two-class dataset
X, y = make_moons(noise=0.3, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# SVC with default hyperparameters (C=1.0, gamma="scale")
svc_default = SVC()
svc_default.fit(X_train, y_train)
acc_def = accuracy_score(y_test, svc_default.predict(X_test))

# SVC with tuned C and gamma; these values are illustrative,
# and in practice you would find them with a systematic search
svc_tuned = SVC(C=10, gamma=1)
svc_tuned.fit(X_train, y_train)
acc_tuned = accuracy_score(y_test, svc_tuned.predict(X_test))

print(f"Default: {acc_def:.2f}")
print(f"Tuned: {acc_tuned:.2f}")
This example demonstrates how a simple change in hyperparameters can lead to a noticeable improvement in accuracy on the test set. The default values are not tailored to your specific data, so tuning is essential for maximizing performance and ensuring your model is ready for real-world data.
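The tuned values in the example above are illustrative; in practice you would search for them systematically. A minimal sketch of one common approach, an exhaustive grid search with cross-validation using scikit-learn's GridSearchCV (the parameter ranges here are assumptions, not recommendations):

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.svm import SVC

X, y = make_moons(noise=0.3, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Candidate values for C and gamma; adjust the ranges to your data
param_grid = {
    "C": [0.1, 1, 10, 100],
    "gamma": [0.01, 0.1, 1, 10],
}

# 5-fold cross-validation picks the combination that generalizes best
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print(f"Test accuracy: {search.score(X_test, y_test):.2f}")

GridSearchCV refits the best combination on the full training set by default, so the fitted search object can be used directly for prediction and scoring.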