Notice: This page requires JavaScript to function properly.
Please enable JavaScript in your browser settings or update your browser.
Aprenda Convexity and Optimization Objectives | Mathematical Foundations
Mathematics of Optimization in ML

bookConvexity and Optimization Objectives

Convexity is a central concept in optimization, especially in machine learning, where you often want to minimize a loss function efficiently and reliably. To understand why convexity matters, you first need to know what makes a set or a function convex.

A convex set in a vector space is a set where, for any two points within it, the straight line connecting them also lies entirely within the set. Formally, a set CC is convex if for any x,yCx, y \in C and any λ\lambda such that 0λ10 \leq \lambda \leq 1:

λx+(1λ)yC\lambda x + (1 - \lambda) y \in C

A convex function is a function where the line segment between any two points on its graph lies above or on the graph itself. Mathematically, a function f:RnRf: \mathbb{R}^n \rightarrow \mathbb{R} is convex on a convex set CC if for all x,yCx, y \in C and 0λ10 \leq \lambda \leq 1:

f(λx+(1λ)y)λf(x)+(1λ)f(y)f(\lambda x + (1 - \lambda) y) \leq \lambda f(x) + (1 - \lambda) f(y)

In machine learning, many loss functions are chosen specifically because they are convex. For example, the mean squared error (MSE) loss in linear regression is convex, which ensures that optimization algorithms like gradient descent can find the global minimum efficiently. On the other hand, the set of solutions where the weights of a linear model satisfy certain constraints, such as all weights being non-negative, forms a convex set.

A simple example of a convex function is f(x)=x2f(x) = x^2, which forms a U-shaped parabola. A non-convex function, such as f(x)=x33xf(x) = x^3 - 3x, has both valleys and peaks, making optimization more challenging. In high-dimensional spaces, convexity ensures that any local minimum is also a global minimum, which is a powerful property for optimization in machine learning.

Note
Note

Convexity is important in optimization because, for convex functions over convex sets, any local minimum is guaranteed to be a global minimum. This property eliminates the risk of getting stuck in suboptimal solutions and allows algorithms like gradient descent to converge reliably and efficiently.

123456789101112131415161718192021222324252627
import numpy as np import matplotlib.pyplot as plt x = np.linspace(-3, 3, 400) convex_f = x**2 nonconvex_f = x**3 - 3*x plt.figure(figsize=(10, 4)) plt.subplot(1, 2, 1) plt.plot(x, convex_f, label="Convex: $f(x) = x^2$") plt.title("Convex Function") plt.xlabel("x") plt.ylabel("f(x)") plt.legend() plt.grid(True) plt.subplot(1, 2, 2) plt.plot(x, nonconvex_f, color="orange", label="Non-convex: $f(x) = x^3 - 3x$") plt.title("Non-convex Function") plt.xlabel("x") plt.ylabel("f(x)") plt.legend() plt.grid(True) plt.tight_layout() plt.show()
copy
question mark

Which of the following statements about convexity is correct?

Select the correct answer

Tudo estava claro?

Como podemos melhorá-lo?

Obrigado pelo seu feedback!

Seção 1. Capítulo 2

Pergunte à IA

expand

Pergunte à IA

ChatGPT

Pergunte o que quiser ou experimente uma das perguntas sugeridas para iniciar nosso bate-papo

Suggested prompts:

Can you explain more about why convexity is important in optimization?

What are some other examples of convex and non-convex functions in machine learning?

How does convexity affect the performance of gradient descent?

Awesome!

Completion rate improved to 5.56

bookConvexity and Optimization Objectives

Deslize para mostrar o menu

Convexity is a central concept in optimization, especially in machine learning, where you often want to minimize a loss function efficiently and reliably. To understand why convexity matters, you first need to know what makes a set or a function convex.

A convex set in a vector space is a set where, for any two points within it, the straight line connecting them also lies entirely within the set. Formally, a set CC is convex if for any x,yCx, y \in C and any λ\lambda such that 0λ10 \leq \lambda \leq 1:

λx+(1λ)yC\lambda x + (1 - \lambda) y \in C

A convex function is a function where the line segment between any two points on its graph lies above or on the graph itself. Mathematically, a function f:RnRf: \mathbb{R}^n \rightarrow \mathbb{R} is convex on a convex set CC if for all x,yCx, y \in C and 0λ10 \leq \lambda \leq 1:

f(λx+(1λ)y)λf(x)+(1λ)f(y)f(\lambda x + (1 - \lambda) y) \leq \lambda f(x) + (1 - \lambda) f(y)

In machine learning, many loss functions are chosen specifically because they are convex. For example, the mean squared error (MSE) loss in linear regression is convex, which ensures that optimization algorithms like gradient descent can find the global minimum efficiently. On the other hand, the set of solutions where the weights of a linear model satisfy certain constraints, such as all weights being non-negative, forms a convex set.

A simple example of a convex function is f(x)=x2f(x) = x^2, which forms a U-shaped parabola. A non-convex function, such as f(x)=x33xf(x) = x^3 - 3x, has both valleys and peaks, making optimization more challenging. In high-dimensional spaces, convexity ensures that any local minimum is also a global minimum, which is a powerful property for optimization in machine learning.

Note
Note

Convexity is important in optimization because, for convex functions over convex sets, any local minimum is guaranteed to be a global minimum. This property eliminates the risk of getting stuck in suboptimal solutions and allows algorithms like gradient descent to converge reliably and efficiently.

123456789101112131415161718192021222324252627
import numpy as np import matplotlib.pyplot as plt x = np.linspace(-3, 3, 400) convex_f = x**2 nonconvex_f = x**3 - 3*x plt.figure(figsize=(10, 4)) plt.subplot(1, 2, 1) plt.plot(x, convex_f, label="Convex: $f(x) = x^2$") plt.title("Convex Function") plt.xlabel("x") plt.ylabel("f(x)") plt.legend() plt.grid(True) plt.subplot(1, 2, 2) plt.plot(x, nonconvex_f, color="orange", label="Non-convex: $f(x) = x^3 - 3x$") plt.title("Non-convex Function") plt.xlabel("x") plt.ylabel("f(x)") plt.legend() plt.grid(True) plt.tight_layout() plt.show()
copy
question mark

Which of the following statements about convexity is correct?

Select the correct answer

Tudo estava claro?

Como podemos melhorá-lo?

Obrigado pelo seu feedback!

Seção 1. Capítulo 2
some-alt