Derivatives and Gradients | Mathematical Foundations
Mathematics of Optimization in ML

Derivatives and Gradients

Derivatives and gradients form the mathematical backbone of optimization in machine learning. A derivative measures how a function changes as its input changes. In one dimension, the derivative of a function f(x) at a point x tells you the rate at which f(x) increases or decreases as you move slightly from x. When dealing with functions of multiple variables, such as f(x, y), the concept generalizes to partial derivatives, which capture the rate of change of the function with respect to each variable independently, holding the others constant.
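
To make partial derivatives concrete, here is a minimal sketch that approximates them numerically with central finite differences. The function f(x, y) = x^2 + 3y, the helper partial_derivatives, and the evaluation point (2, 1) are illustrative assumptions, not part of the lesson.

# Illustrative function (assumption): f(x, y) = x^2 + 3y
def f(x, y):
    return x**2 + 3 * y

def partial_derivatives(func, x0, y0, h=1e-5):
    # df/dx: nudge x while holding y constant
    df_dx = (func(x0 + h, y0) - func(x0 - h, y0)) / (2 * h)
    # df/dy: nudge y while holding x constant
    df_dy = (func(x0, y0 + h) - func(x0, y0 - h)) / (2 * h)
    return df_dx, df_dy

# Analytically, df/dx = 2x and df/dy = 3, so at (2, 1) we expect roughly (4, 3)
print(partial_derivatives(f, 2.0, 1.0))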

A gradient is a vector that collects all the partial derivatives of a function with respect to its inputs. For a function f(x, y), the gradient is written as:

∇f(x, y) = \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right]

This vector points in the direction of the greatest rate of increase of the function. In optimization, gradients are essential: they guide algorithms on how to adjust parameters to minimize or maximize an objective function. When you hear about moving in the direction of the negative gradient, it means taking steps that most rapidly decrease the value of the function, which is the core idea behind gradient-based optimization methods.

Note

Think of the gradient as a compass that always points toward the direction of steepest ascent on a surface. If you want to climb a hill as quickly as possible, you would follow the direction of the gradient. Conversely, to descend as quickly as possible, like minimizing a loss function in machine learning, you go in the opposite direction, following the negative gradient.
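
As a minimal sketch of this idea, the loop below repeatedly steps along the negative gradient of f(x, y) = x^2 + y^2 (the same function plotted in the next example). The starting point and learning rate are arbitrary illustrative values, not prescribed by the lesson.

import numpy as np

# Gradient of f(x, y) = x^2 + y^2
def grad_f(point):
    x, y = point
    return np.array([2 * x, 2 * y])

# Illustrative starting point and learning rate (assumptions)
point = np.array([2.5, -1.5])
learning_rate = 0.1

for _ in range(50):
    # Step in the direction of the negative gradient (steepest descent)
    point = point - learning_rate * grad_f(point)

print(point)  # approaches the minimum at (0, 0)

Each step moves the point toward the origin, where the function reaches its minimum, which is exactly the behavior gradient-based optimizers rely on.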

import numpy as np
import matplotlib.pyplot as plt

# Define a simple 2D quadratic function: f(x, y) = x^2 + y^2
def f(x, y):
    return x**2 + y**2

# Compute the gradient of f
def grad_f(x, y):
    df_dx = 2 * x
    df_dy = 2 * y
    return np.array([df_dx, df_dy])

# Create a grid of points
x = np.linspace(-3, 3, 20)
y = np.linspace(-3, 3, 20)
X, Y = np.meshgrid(x, y)

# Compute gradients at each grid point
U = 2 * X
V = 2 * Y

plt.figure(figsize=(6, 6))
plt.quiver(X, Y, U, V, color="blue")
plt.title("Gradient Vector Field of $f(x, y) = x^2 + y^2$")
plt.xlabel("x")
plt.ylabel("y")
plt.grid(True)
plt.show()

Which statement best describes the role of the gradient in optimization?


