Mathematics for Data Analysis and Modeling

Gradient Descent Method

We know how to solve optimization problems for a function of one variable using the algorithm described in the previous chapter. But what can we do if we have a function of multiple variables? We can turn to a numerical method: gradient descent.

What is gradient descent?

The gradient is a vector that consists of all partial derivatives of the function:

$$\nabla F(x_1, \dots, x_n) = \left( \frac{\partial F}{\partial x_1}, \frac{\partial F}{\partial x_2}, \dots, \frac{\partial F}{\partial x_n} \right)$$

Thus, the problem of minimizing the function F(x_1, ..., x_n) can be solved by constructing the following sequence of approximations:

$$x_{k+1} = x_k - \eta \, \nabla F(x_k), \qquad k = 0, 1, 2, \dots$$

We set an initial value x_0 and a value η representing the speed (step size) of gradient descent. Then we start the iterative process according to the formula above.
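As a minimal sketch, assuming a toy function F(x, y) = x**2 + 3*y**2 with a hand-computed gradient (neither appears in the text above), a single update step looks like this:

```python
import numpy as np

def grad_F(x):
    # Gradient of the toy function F(x, y) = x**2 + 3*y**2: (2x, 6y)
    return np.array([2 * x[0], 6 * x[1]])

eta = 0.1                    # learning rate (speed of gradient descent)
x = np.array([1.0, 1.0])     # initial value x0

# One iteration of x_{k+1} = x_k - eta * grad F(x_k)
x = x - eta * grad_F(x)
print(x)   # prints [0.8 0.4], moving towards the minimum at (0, 0)
```

Repeating this step drives x towards the minimizer; the stop criteria below decide when to halt.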

Stop criteria of the algorithm

The criteria for stopping iterations can be as follows:

  1. Stop the algorithm after a certain number of iterations;
  2. Iterate until the following condition is met:
$$\left\| x_{k+1} - x_k \right\| < \varepsilon$$

Note

Values such as eps = 10**(-6) or eps = 10**(-9) are commonly used as the tolerance in this stop criterion; a sketch combining both criteria is shown below.
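A sketch of the full iterative process for the same assumed toy function, with the loop capped by a maximum number of iterations and terminated early once successive approximations differ by less than eps:

```python
import numpy as np

def grad_F(x):
    # Gradient of the toy function F(x, y) = x**2 + 3*y**2
    return np.array([2 * x[0], 6 * x[1]])

eta = 0.1                  # learning rate
eps = 10**(-6)             # tolerance for criterion 2
max_iter = 10_000          # criterion 1: fixed number of iterations

x = np.array([1.0, 1.0])   # initial value x0
for _ in range(max_iter):
    x_new = x - eta * grad_F(x)
    # Criterion 2: stop when successive approximations are close enough
    if np.linalg.norm(x_new - x) < eps:
        x = x_new
        break
    x = x_new

print(x)   # approximately (0, 0), the minimum of the toy function
```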

We have to pay attention to two important features of the gradient descent method:

  1. This method can only find a minimum point of the function F(x). If you want to find a maximum point, you can apply gradient descent to the function -F(x) instead (see the sketch after this list);
  2. If we compare the algorithm discussed earlier with gradient descent, we see that gradient descent performs a task similar to the first stage of that algorithm: finding a critical point that is only a candidate minimum. As a result, the point found by gradient descent may be a local minimum on some subset of the domain, or not a minimum point at all.
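For the first point, a minimal sketch with an assumed one-dimensional toy function F(x) = -(x - 2)**2 shows how descending on -F(x) locates a maximum:

```python
# Maximize F(x) = -(x - 2)**2 by running gradient descent on G(x) = -F(x).
def grad_G(x):
    # Derivative of G(x) = (x - 2)**2
    return 2 * (x - 2)

eta, eps = 0.1, 10**(-6)
x = 0.0                        # initial value x0
for _ in range(10_000):
    x_new = x - eta * grad_G(x)
    if abs(x_new - x) < eps:
        x = x_new
        break
    x = x_new

print(x)   # approximately 2, the maximum point of F
```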

Example

Let's see how to solve an optimization problem in Python:
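Below is a minimal sketch of such a solution; the two-dimensional form of the Rosenbrock function, the starting point, and the default solver settings are assumptions rather than fixed requirements:

```python
import numpy as np
from scipy.optimize import minimize

def rosenbrock(x):
    # Rosenbrock function: (1 - x)**2 + 100 * (y - x**2)**2
    return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

x0 = np.array([0.0, 0.0])          # initial guess for x

result = minimize(rosenbrock, x0)  # numerical minimization via SciPy

print("Optimal x:", result.x)        # close to [1, 1]
print("Minimum value:", result.fun)  # close to 0
```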

In this example, we define the Rosenbrock function, set an initial guess for x, and then use scipy.optimize.minimize to find the minimum of the Rosenbrock function. The result.x attribute contains the optimal x, and result.fun contains the minimum value of the Rosenbrock function.

Note

The Rosenbrock function is often used as a benchmark for testing and comparing optimization algorithms due to its non-convex nature and the presence of a narrow, curved minimum valley.
