Learn Hyperparameter Tuning

Hyperparameters in Neural Networks

Neural networks, including perceptrons, have several hyperparameters that influence their performance. Unlike model parameters (e.g., weights and biases), hyperparameters are set before training begins. Some key hyperparameters in perceptrons include:

Number of hidden layers and neurons per layer: determines the model's capacity to learn complex patterns. Too few neurons can lead to underfitting, while too many can cause overfitting;
Learning rate: controls how much the model adjusts weights during training. A high learning rate can make training unstable, while a low one may lead to slow convergence;
Number of training epochs: defines how many times the model sees the training data. More epochs allow better learning but may lead to overfitting if excessive.
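To make the distinction between hyperparameters and parameters concrete, here is a minimal sketch using scikit-learn's MLPClassifier. The synthetic dataset from make_classification is an assumption made purely for illustration; it is not the lesson's dataset. The hyperparameters are passed to the constructor before training, while the weights and biases only exist after fit has run. With the default adam solver, max_iter plays the role of the number of epochs.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier

# Training may stop before full convergence with few epochs; silence that warning
warnings.filterwarnings("ignore", category=ConvergenceWarning)

# Synthetic stand-in data (assumption: 4 input features, 2 classes)
X, y = make_classification(n_samples=200, n_features=4, random_state=10)

# Hyperparameters: chosen by us before training starts
model = MLPClassifier(
    hidden_layer_sizes=(6, 6),   # two hidden layers with 6 neurons each
    learning_rate_init=0.01,     # learning rate
    max_iter=100,                # number of epochs for the default adam solver
    random_state=10,
)
model.fit(X, y)

# Parameters: weights and biases learned during training
weight_shapes = [w.shape for w in model.coefs_]
bias_shapes = [b.shape for b in model.intercepts_]
print("Weight matrices:", weight_shapes)
print("Bias vectors:", bias_shapes)
```

Changing hidden_layer_sizes changes the number and shape of the weight matrices, which is one way to see how this hyperparameter controls the model's capacity.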

Hyperparameter Tuning

Hyperparameter tuning is crucial for optimizing neural networks. A poorly tuned model can result in underfitting or overfitting.

You can tweak the number of epochs, the number of hidden layers, their size, and the learning rate to observe how the accuracy on the train and test sets changes:


from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score
import numpy as np
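import os
import warnings

# Suppress warnings (e.g., convergence warnings) to keep the output readable
warnings.filterwarnings("ignore")

# Download the lesson's data module, then import the train/test splits from it
os.system('wget https://codefinity-content-media.s3.eu-west-1.amazonaws.com/f9fc718f-c98b-470d-ba78-d84ef16ba45f/section_2/data.py 2>/dev/null')
from data import X_train, y_train, X_test, y_test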

np.random.seed(10)
# Tweak hyperparameters here
model = MLPClassifier(max_iter=100, hidden_layer_sizes=(6, 6), learning_rate_init=0.01, random_state=10)

model.fit(X_train, y_train)

y_pred_train = model.predict(X_train)
y_pred_test = model.predict(X_test)
# Comparing train set accuracy and test set accuracy
train_accuracy = accuracy_score(y_train, y_pred_train)
test_accuracy = accuracy_score(y_test, y_pred_test)
print(f'Train accuracy: {train_accuracy:.3f}')
print(f'Test accuracy: {test_accuracy:.3f}')

Finding the right combination of hyperparameters involves systematic experimentation and adjustments. This is often done using techniques like grid search (trying every combination from a predefined set of candidate values) and random search (evaluating a fixed number of randomly sampled combinations from specified values or distributions).
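Both search strategies can be sketched with scikit-learn's GridSearchCV and RandomizedSearchCV. This is only an illustration under stated assumptions: the data is a synthetic stand-in generated with make_classification, and the candidate values in param_grid are arbitrary choices, not recommended settings. Both classes score each combination with cross-validation on the training data.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.neural_network import MLPClassifier

warnings.filterwarnings("ignore", category=ConvergenceWarning)

# Synthetic stand-in data (assumption, not the lesson's dataset)
X, y = make_classification(n_samples=300, n_features=4, random_state=10)

# Candidate values for each hyperparameter (illustrative choices)
param_grid = {
    "hidden_layer_sizes": [(6,), (6, 6)],
    "learning_rate_init": [0.001, 0.01],
    "max_iter": [100, 200],
}

# Grid search: evaluates all 2 * 2 * 2 = 8 combinations
grid = GridSearchCV(MLPClassifier(random_state=10), param_grid, cv=3)
grid.fit(X, y)
print("Grid search best params:", grid.best_params_)
print(f"Grid search best CV accuracy: {grid.best_score_:.3f}")

# Random search: evaluates only n_iter randomly chosen combinations
rand = RandomizedSearchCV(
    MLPClassifier(random_state=10),
    param_grid,
    n_iter=4,
    cv=3,
    random_state=10,
)
rand.fit(X, y)
print("Random search best params:", rand.best_params_)
print(f"Random search best CV accuracy: {rand.best_score_:.3f}")
```

Grid search cost grows multiplicatively with every added hyperparameter, whereas random search keeps the budget fixed through n_iter, which is why it is often preferred for larger search spaces.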

Essentially, training a neural network follows an iterative cycle:

Define the model with initial hyperparameters;
Train the model using the training dataset;
Evaluate performance on a test set;
Adjust hyperparameters (e.g., number of layers, learning rate);
Repeat the process until the desired performance is achieved.
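The cycle above can be sketched as a simple loop. This is a hedged illustration, not the lesson's own procedure: the data is synthetic, the candidates list is a hypothetical set of settings, and the sketch uses one common variation in which candidates are compared on a separate validation split so that the test set is used only once, for the final check.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

warnings.filterwarnings("ignore", category=ConvergenceWarning)

# Synthetic stand-in data (assumption), split into train / validation / test
X, y = make_classification(n_samples=400, n_features=4, random_state=10)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=10
)
X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, test_size=0.25, random_state=10
)

# Hypothetical hyperparameter settings to try, one per iteration of the cycle
candidates = [
    {"hidden_layer_sizes": (6,), "learning_rate_init": 0.01},
    {"hidden_layer_sizes": (6, 6), "learning_rate_init": 0.01},
    {"hidden_layer_sizes": (6, 6), "learning_rate_init": 0.001},
]

best_score, best_params, best_model = -1.0, None, None
for params in candidates:
    # 1. Define the model with the current hyperparameters
    model = MLPClassifier(max_iter=200, random_state=10, **params)
    # 2. Train on the training set
    model.fit(X_train, y_train)
    # 3. Evaluate on held-out data (validation accuracy)
    score = model.score(X_val, y_val)
    print(f"{params} -> validation accuracy {score:.3f}")
    # 4-5. Keep the best setting so far and move on to the next candidate
    if score > best_score:
        best_score, best_params, best_model = score, params, model

# Final check on data that played no part in choosing the hyperparameters
test_accuracy = best_model.score(X_test, y_test)
print("Best hyperparameters:", best_params)
print(f"Test accuracy of the selected model: {test_accuracy:.3f}")
```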

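This iterative refinement helps the model generalize well to unseen data. Because the hyperparameters are chosen based on the evaluation results, it is good practice to confirm the final model on data that played no part in the tuning.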
