Training of the Model

Preparing for Training

First, you need to ensure that the model, loss function, and optimizer are properly defined. Let's go through each step:

  1. Loss function: for classification, you can use CrossEntropyLoss, which expects raw continuous values (logits) as input and automatically applies softmax;
  2. Optimizer: you can use the Adam optimizer for efficient gradient updates.

In PyTorch, cross-entropy loss combines log-softmax and negative log-likelihood (NLL) loss into a single loss function:

L(z, y) = -\log\left(\frac{e^{z_y}}{\sum_{c=1}^{C} e^{z_c}}\right) = -z_y + \log\sum_{c=1}^{C} e^{z_c}

where:

  • z_y is the logit corresponding to the correct class;
  • C is the total number of classes.
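As a quick sanity check, you can verify this equivalence numerically; the logits and labels below are made-up illustrative values, not part of the course dataset:

import torch
import torch.nn as nn
import torch.nn.functional as F

# Made-up logits for 2 samples and 3 classes, plus the true class indices
logits = torch.tensor([[2.0, 0.5, -1.0], [0.1, 1.5, 0.3]])
targets = torch.tensor([0, 1])

# Cross-entropy applied directly to raw logits
ce = nn.CrossEntropyLoss()(logits, targets)

# Equivalent two-step computation: log-softmax followed by NLL loss
nll = F.nll_loss(F.log_softmax(logits, dim=1), targets)

print(ce.item(), nll.item())  # the two values match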

It is also important to split the data into training and validation sets (ideally, a separate test set should also exist). Since the dataset is relatively small (1143 rows), we use an 80% to 20% split. In this case, the validation set will also serve as the test set.

Moreover, the resulting NumPy arrays should be converted to tensors, as PyTorch models require tensor inputs for computations.
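A minimal sketch of these two steps, assuming X and y are NumPy arrays of features and integer class labels (as in the full script below), could look like this:

from sklearn.model_selection import train_test_split
import torch

# 80/20 split; the 20% part serves as both validation and test set here
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Features become float tensors; class labels must be integer (long) tensors for CrossEntropyLoss
X_train = torch.tensor(X_train, dtype=torch.float32)
X_test = torch.tensor(X_test, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.long)
y_test = torch.tensor(y_test, dtype=torch.long)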

Training Loop

The training loop involves the following steps for each epoch:

  1. Forward pass: pass the input features through the model to generate predictions;
  2. Loss calculation: compare the predictions with the ground truth using the loss function;
  3. Backward pass: compute gradients with respect to the model parameters using backpropagation;
  4. Parameter update: adjust model parameters using the optimizer;
  5. Monitoring progress: print the loss periodically to observe convergence.

As you can see, the training process is similar to that of linear regression.

import torch.nn as nn
import torch
import torch.optim as optim
import matplotlib.pyplot as plt
import os
os.system('wget https://staging-content-media-cdn.codefinity.com/courses/1dd2b0f6-6ec0-40e6-a570-ed0ac2209666/section_3/model_definition.py 2>/dev/null')
from model_definition import model, X, y
from sklearn.model_selection import train_test_split

# Set manual seed for reproducibility
torch.manual_seed(42)
# Reinitialize model after setting seed
model.apply(lambda m: m.reset_parameters() if hasattr(m, "reset_parameters") else None)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

X_train = torch.tensor(X_train, dtype=torch.float32)
X_test = torch.tensor(X_test, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.long)
y_test = torch.tensor(y_test, dtype=torch.long)

# Define the loss function (Cross-Entropy for multi-class classification)
criterion = nn.CrossEntropyLoss()
# Define the optimizer (Adam with a learning rate of 0.01)
optimizer = optim.Adam(model.parameters(), lr=0.01)

# Number of epochs
epochs = 100
# Store losses for plotting
training_losses = []

# Training loop
for epoch in range(epochs):
    # Zero out gradients from the previous step
    optimizer.zero_grad()
    # Compute predictions
    predictions = model(X_train)
    # Compute the loss
    loss = criterion(predictions, y_train)
    # Compute gradients
    loss.backward()
    # Update parameters
    optimizer.step()
    # Store the loss
    training_losses.append(loss.item())

# Plot the training loss
plt.plot(range(epochs), training_losses, label="Training Loss")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.title("Training Loss over Epochs")
plt.legend()
plt.show()

Observing Convergence

In addition to training the model, we also record the training loss at each epoch and plot it over time. As shown in the graph, the training loss initially decreases rapidly and then gradually levels off, stabilizing by around epoch 60. Beyond this point, the loss decreases at a much slower rate, suggesting that the model has likely converged; training for roughly 40 to 60 epochs would therefore be sufficient for this model.
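After training, you would typically also measure performance on the held-out set. This evaluation step is not included in the script above; the sketch below assumes the model, X_test, and y_test defined earlier and a model that outputs one logit per class:

import torch

# Switch to evaluation mode and disable gradient tracking for inference
model.eval()
with torch.no_grad():
    test_logits = model(X_test)
    # Predicted class = index of the largest logit for each sample
    predicted_classes = test_logits.argmax(dim=1)
    accuracy = (predicted_classes == y_test).float().mean().item()

print(f"Validation accuracy: {accuracy:.2%}")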


Section 3. Chapter 2