Notice: This page requires JavaScript to function properly.
Please enable JavaScript in your browser settings or update your browser.
Learn Convexity and Optimization Objectives | Mathematical Foundations
Mathematics of Optimization in ML

bookConvexity and Optimization Objectives

Convexity is a central concept in optimization, especially in machine learning, where you often want to minimize a loss function efficiently and reliably. To understand why convexity matters, you first need to know what makes a set or a function convex.

A convex set in a vector space is a set where, for any two points within it, the straight line connecting them also lies entirely within the set. Formally, a set CC is convex if for any x,y∈Cx, y \in C and any Ξ»\lambda such that 0≀λ≀10 \leq \lambda \leq 1:

Ξ»x+(1βˆ’Ξ»)y∈C\lambda x + (1 - \lambda) y \in C

A convex function is a function where the line segment between any two points on its graph lies above or on the graph itself. Mathematically, a function f:Rnβ†’Rf: \mathbb{R}^n \rightarrow \mathbb{R} is convex on a convex set CC if for all x,y∈Cx, y \in C and 0≀λ≀10 \leq \lambda \leq 1:

f(Ξ»x+(1βˆ’Ξ»)y)≀λf(x)+(1βˆ’Ξ»)f(y)f(\lambda x + (1 - \lambda) y) \leq \lambda f(x) + (1 - \lambda) f(y)

In machine learning, many loss functions are chosen specifically because they are convex. For example, the mean squared error (MSE) loss in linear regression is convex, which ensures that optimization algorithms like gradient descent can find the global minimum efficiently. On the other hand, the set of solutions where the weights of a linear model satisfy certain constraints, such as all weights being non-negative, forms a convex set.

A simple example of a convex function is f(x)=x2f(x) = x^2, which forms a U-shaped parabola. A non-convex function, such as f(x)=x3βˆ’3xf(x) = x^3 - 3x, has both valleys and peaks, making optimization more challenging. In high-dimensional spaces, convexity ensures that any local minimum is also a global minimum, which is a powerful property for optimization in machine learning.

Note
Note

Convexity is important in optimization because, for convex functions over convex sets, any local minimum is guaranteed to be a global minimum. This property eliminates the risk of getting stuck in suboptimal solutions and allows algorithms like gradient descent to converge reliably and efficiently.

123456789101112131415161718192021222324252627
import numpy as np import matplotlib.pyplot as plt x = np.linspace(-3, 3, 400) convex_f = x**2 nonconvex_f = x**3 - 3*x plt.figure(figsize=(10, 4)) plt.subplot(1, 2, 1) plt.plot(x, convex_f, label="Convex: $f(x) = x^2$") plt.title("Convex Function") plt.xlabel("x") plt.ylabel("f(x)") plt.legend() plt.grid(True) plt.subplot(1, 2, 2) plt.plot(x, nonconvex_f, color="orange", label="Non-convex: $f(x) = x^3 - 3x$") plt.title("Non-convex Function") plt.xlabel("x") plt.ylabel("f(x)") plt.legend() plt.grid(True) plt.tight_layout() plt.show()
copy
question mark

Which of the following statements about convexity is correct?

Select the correct answer

Everything was clear?

How can we improve it?

Thanks for your feedback!

SectionΒ 1. ChapterΒ 2

Ask AI

expand

Ask AI

ChatGPT

Ask anything or try one of the suggested questions to begin our chat

Awesome!

Completion rate improved to 5.56

bookConvexity and Optimization Objectives

Swipe to show menu

Convexity is a central concept in optimization, especially in machine learning, where you often want to minimize a loss function efficiently and reliably. To understand why convexity matters, you first need to know what makes a set or a function convex.

A convex set in a vector space is a set where, for any two points within it, the straight line connecting them also lies entirely within the set. Formally, a set CC is convex if for any x,y∈Cx, y \in C and any Ξ»\lambda such that 0≀λ≀10 \leq \lambda \leq 1:

Ξ»x+(1βˆ’Ξ»)y∈C\lambda x + (1 - \lambda) y \in C

A convex function is a function where the line segment between any two points on its graph lies above or on the graph itself. Mathematically, a function f:Rnβ†’Rf: \mathbb{R}^n \rightarrow \mathbb{R} is convex on a convex set CC if for all x,y∈Cx, y \in C and 0≀λ≀10 \leq \lambda \leq 1:

f(Ξ»x+(1βˆ’Ξ»)y)≀λf(x)+(1βˆ’Ξ»)f(y)f(\lambda x + (1 - \lambda) y) \leq \lambda f(x) + (1 - \lambda) f(y)

In machine learning, many loss functions are chosen specifically because they are convex. For example, the mean squared error (MSE) loss in linear regression is convex, which ensures that optimization algorithms like gradient descent can find the global minimum efficiently. On the other hand, the set of solutions where the weights of a linear model satisfy certain constraints, such as all weights being non-negative, forms a convex set.

A simple example of a convex function is f(x)=x2f(x) = x^2, which forms a U-shaped parabola. A non-convex function, such as f(x)=x3βˆ’3xf(x) = x^3 - 3x, has both valleys and peaks, making optimization more challenging. In high-dimensional spaces, convexity ensures that any local minimum is also a global minimum, which is a powerful property for optimization in machine learning.

Note
Note

Convexity is important in optimization because, for convex functions over convex sets, any local minimum is guaranteed to be a global minimum. This property eliminates the risk of getting stuck in suboptimal solutions and allows algorithms like gradient descent to converge reliably and efficiently.

123456789101112131415161718192021222324252627
import numpy as np import matplotlib.pyplot as plt x = np.linspace(-3, 3, 400) convex_f = x**2 nonconvex_f = x**3 - 3*x plt.figure(figsize=(10, 4)) plt.subplot(1, 2, 1) plt.plot(x, convex_f, label="Convex: $f(x) = x^2$") plt.title("Convex Function") plt.xlabel("x") plt.ylabel("f(x)") plt.legend() plt.grid(True) plt.subplot(1, 2, 2) plt.plot(x, nonconvex_f, color="orange", label="Non-convex: $f(x) = x^3 - 3x$") plt.title("Non-convex Function") plt.xlabel("x") plt.ylabel("f(x)") plt.legend() plt.grid(True) plt.tight_layout() plt.show()
copy
question mark

Which of the following statements about convexity is correct?

Select the correct answer

Everything was clear?

How can we improve it?

Thanks for your feedback!

SectionΒ 1. ChapterΒ 2
some-alt