Gradients in PyTorch

Gradients are fundamental in optimization tasks like training neural networks, where they help adjust weights and biases to minimize error. In PyTorch, they are calculated automatically using the autograd module, which tracks operations on tensors and computes derivatives efficiently.

Enabling Gradient Tracking

To enable gradient tracking for a tensor, the requires_grad=True argument is used when creating the tensor. This tells PyTorch to track all operations on the tensor for gradient computation.

import torch
# Create a tensor with gradient tracking enabled
tensor = torch.tensor(2.0, requires_grad=True)
print(tensor)
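
If a tensor already exists, tracking can also be switched on afterwards with the in-place requires_grad_() method. A minimal sketch:

import torch
# Create a tensor without gradient tracking
t = torch.tensor(3.0)
print(t.requires_grad)   # False
# Enable tracking in place
t.requires_grad_(True)
print(t.requires_grad)   # True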

Building a Computational Graph

PyTorch builds a dynamic computational graph as you perform operations on tensors with requires_grad=True. This graph stores the relationships between tensors and operations, enabling automatic differentiation.

We'll start by defining a rather simple polynomial function:

y = 5x³ + 2x² + 4x + 8

Our goal is to calculate the derivative with respect to x at x = 2.

import torch
# Define the tensor
x = torch.tensor(2.0, requires_grad=True)
# Define the function
y = 5 * x ** 3 + 2 * x ** 2 + 4 * x + 8
print(f"Function output: {y}")
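
You can also inspect the graph directly: every tensor produced by a tracked operation carries a grad_fn attribute pointing to the operation that created it, while user-created leaf tensors have none:

import torch
x = torch.tensor(2.0, requires_grad=True)
y = 5 * x ** 3 + 2 * x ** 2 + 4 * x + 8
# x is a user-created leaf tensor, so it has no grad_fn
print(x.is_leaf, x.grad_fn)
# y was produced by an operation (the final addition), so its grad_fn is AddBackward0
print(y.grad_fn)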

The computational graph can be visualized with the PyTorchViz library. The resulting diagram may look somewhat complex, but it effectively conveys how each operation links the tensors involved.
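
Here is a minimal sketch of producing such a diagram, assuming the third-party torchviz package (and the Graphviz tool it relies on) is installed:

import torch
from torchviz import make_dot  # pip install torchviz
x = torch.tensor(2.0, requires_grad=True)
y = 5 * x ** 3 + 2 * x ** 2 + 4 * x + 8
# Render the graph to graph.png, labeling the input node "x"
make_dot(y, params={"x": x}).render("graph", format="png")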

Calculating Gradients

To compute the gradient, the backward() method should be called on the output tensor. This computes the derivative of the function with respect to the input tensor.

The computed gradient can then be accessed through the input tensor's .grad attribute.

import torch
x = torch.tensor(2.0, requires_grad=True)
y = 5 * x ** 3 + 2 * x ** 2 + 4 * x + 8
# Perform backpropagation
y.backward()
# Print the gradient of x
grad = x.grad
print(f"Gradient of x: {grad}")

The computed gradient is the derivative of y with respect to x, evaluated at x = 2. Analytically, dy/dx = 15x² + 4x + 4, which at x = 2 gives 15·4 + 8 + 4 = 72, matching the printed value.
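
One detail worth keeping in mind: gradients accumulate in .grad. Calling backward() on a new computation adds to the stored value rather than replacing it, so the gradient is usually reset between independent passes. A short sketch:

import torch
x = torch.tensor(2.0, requires_grad=True)
# First pass
y = 5 * x ** 3 + 2 * x ** 2 + 4 * x + 8
y.backward()
print(x.grad)  # tensor(72.)
# Running the same computation again ADDS to the stored gradient
y = 5 * x ** 3 + 2 * x ** 2 + 4 * x + 8
y.backward()
print(x.grad)  # tensor(144.)
# Reset the stored gradient before the next independent pass
x.grad.zero_()
print(x.grad)  # tensor(0.)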
