Moore–Aronszajn Theorem | Kernels as Inner Products
Reproducing Kernel Hilbert Spaces Theory

Moore–Aronszajn Theorem

The Moore–Aronszajn theorem is a foundational result in the theory of reproducing kernel Hilbert spaces (RKHS). It formally establishes a deep correspondence between positive definite kernels and Hilbert spaces of functions, clarifying how every such kernel uniquely determines a Hilbert space in which it serves as an inner product for function evaluation.

Let $X$ be a nonempty set, and let $K : X \times X \to \mathbb{R}$ be a symmetric, positive definite kernel. The theorem states:

Moore–Aronszajn Theorem:
For every positive definite kernel $K$ on $X$, there exists a unique Hilbert space $\mathcal{H}$ of functions $f : X \to \mathbb{R}$ such that:

  1. For every $x \in X$, the function $K(\cdot, x)$ belongs to $\mathcal{H}$;
  2. For every $f \in \mathcal{H}$ and every $x \in X$, the reproducing property holds: $f(x) = \langle f, K(\cdot, x) \rangle_{\mathcal{H}}$.

Moreover, $\mathcal{H}$ is called the reproducing kernel Hilbert space associated with $K$, and $K$ is its reproducing kernel.

To understand why this correspondence exists and is unique, consider the following proof sketch. The proof has two main parts: existence and uniqueness.

Existence:
Given a positive definite kernel $K$, you can construct a vector space of finite linear combinations of the form

$$f = \sum_{i=1}^n \alpha_i K(\cdot, x_i)$$

where $x_i \in X$ and $\alpha_i \in \mathbb{R}$. Define an inner product on this space by

$$\left\langle \sum_{i=1}^n \alpha_i K(\cdot, x_i),\; \sum_{j=1}^m \beta_j K(\cdot, y_j) \right\rangle = \sum_{i=1}^n \sum_{j=1}^m \alpha_i \beta_j K(x_i, y_j)$$

This inner product is well-defined and positive definite due to the properties of $K$. Completing this space with respect to the induced norm yields a Hilbert space $\mathcal{H}$ of functions on $X$; the completion consists of genuine functions because evaluation is continuous: by Cauchy–Schwarz, $|f(x)| = |\langle f, K(\cdot, x) \rangle| \le \|f\| \sqrt{K(x, x)}$, so norm-Cauchy sequences converge pointwise. By construction, $K(\cdot, x) \in \mathcal{H}$ for all $x$, and the reproducing property holds: for any $f \in \mathcal{H}$ and $x \in X$, $f(x) = \langle f, K(\cdot, x) \rangle_{\mathcal{H}}$.
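The pre-Hilbert construction can be checked numerically on the span of finitely many sections. A minimal sketch, assuming a Gaussian kernel on $\mathbb{R}$ (the point sets, coefficients, and bandwidth below are illustrative choices, not from the text):

```python
import numpy as np

def k(x, y, gamma=1.0):
    """Gaussian (RBF) kernel, symmetric and positive definite on R."""
    return np.exp(-gamma * np.subtract.outer(x, y) ** 2)

# f = sum_i alpha_i K(., x_i) and g = sum_j beta_j K(., y_j)
xs, alpha = np.array([0.0, 0.5, 2.0]), np.array([1.0, -2.0, 0.7])
ys, beta = np.array([1.0, 1.5]), np.array([0.3, 0.9])

# <f, g> = sum_{i,j} alpha_i beta_j K(x_i, y_j) -- the defining formula
inner_fg = alpha @ k(xs, ys) @ beta

# Positive definiteness of K: the Gram matrix is PSD, so ||f||^2 >= 0
G = k(xs, xs)
assert np.all(np.linalg.eigvalsh(G) >= -1e-12)
assert alpha @ G @ alpha >= 0

# Reproducing property on the span: with g = K(., x0) (m = 1, beta_1 = 1),
# the bilinear formula collapses to sum_i alpha_i K(x_i, x0), i.e. f(x0)
x0 = 1.3
sect = k(xs, np.array([x0]))[:, 0]                       # vector of K(x_i, x0)
f_x0 = alpha @ sect                                      # direct evaluation of f at x0
inner_f_sect = alpha @ k(xs, np.array([x0])) @ np.array([1.0])
assert np.isclose(f_x0, inner_f_sect)
```

The assertions simply confirm that the inner-product formula is consistent with pointwise evaluation on the finite span, which is exactly the starting point of the completion argument.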

Uniqueness:
Suppose there are two Hilbert spaces of functions on $X$ with reproducing kernel $K$. In each space, the reproducing property forces the sections to span a dense subspace (a function orthogonal to every $K(\cdot, x)$ vanishes at every point) and pins down their inner products: $\langle K(\cdot, x), K(\cdot, y) \rangle = K(x, y)$. The inner products therefore agree on a dense subspace of finite linear combinations of sections, so the two spaces coincide as Hilbert spaces. Thus, the RKHS associated with $K$ is unique.

The consequences of the Moore–Aronszajn theorem are far-reaching. It provides the mathematical justification for using kernels in functional analysis, as it guarantees that every positive definite kernel gives rise to a unique Hilbert space of functions with powerful evaluation properties. In machine learning, this underpins kernel methods such as support vector machines, kernel ridge regression, and Gaussian processes: any algorithm that relies on a positive definite kernel can be interpreted as operating in an implicit Hilbert space of functions, even when that space is infinite-dimensional. This insight enables you to design algorithms that handle nonlinear relationships and complex data structures using only kernel evaluations.
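The "only kernel evaluations" point can be made concrete with kernel ridge regression, one of the methods mentioned above. A minimal sketch, assuming a Gaussian kernel and synthetic data (the data, bandwidth, and regularization strength are illustrative):

```python
import numpy as np

def rbf(x, y, gamma=1.0):
    """Gaussian kernel; any positive definite kernel could be substituted."""
    return np.exp(-gamma * np.subtract.outer(x, y) ** 2)

rng = np.random.default_rng(0)
X = np.linspace(0, 3, 20)
y = np.sin(X) + 0.1 * rng.standard_normal(X.size)

# Kernel ridge regression: solve (G + lam I) alpha = y. The fitted function
# f = sum_i alpha_i K(., x_i) is an element of the RKHS of the kernel.
lam = 0.1
G = rbf(X, X)
alpha = np.linalg.solve(G + lam * np.eye(X.size), y)

# Predictions at new points require only kernel evaluations against the
# training points -- the (possibly infinite-dimensional) RKHS stays implicit.
X_new = np.array([0.7, 2.2])
y_pred = rbf(X_new, X) @ alpha
```

Swapping in a different positive definite kernel changes the implicit function space but not the algorithm, which is the practical content of the theorem.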

Definitions:

  • Kernel: A function $K : X \times X \to \mathbb{R}$ that is symmetric ($K(x, y) = K(y, x)$) and positive definite (for any finite set $\{x_1, \ldots, x_n\} \subset X$, the matrix $[K(x_i, x_j)]$ is positive semidefinite);
  • Section: For fixed $x \in X$, the function $K(\cdot, x)$ is called the section of $K$ at $x$;
  • Reproducing property: For all $f$ in the RKHS and $x \in X$, $f(x) = \langle f, K(\cdot, x) \rangle_{\mathcal{H}}$.
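The positive definiteness condition in the kernel definition can be checked empirically on any finite point set by inspecting the eigenvalues of the Gram matrix. A sketch, where the specific kernels and points are illustrative choices:

```python
import numpy as np

def gram(kernel, pts):
    """Gram matrix [K(x_i, x_j)] for a kernel and a finite point set."""
    return np.array([[kernel(a, b) for b in pts] for a in pts])

def looks_psd(kernel, pts, tol=1e-8):
    """Symmetric, and all Gram eigenvalues >= -tol, on this point set."""
    G = gram(kernel, pts)
    return np.allclose(G, G.T) and np.all(np.linalg.eigvalsh(G) >= -tol)

pts = np.linspace(-2, 2, 8)
assert looks_psd(lambda a, b: np.exp(-(a - b) ** 2), pts)   # Gaussian: PD
assert looks_psd(lambda a, b: (1 + a * b) ** 2, pts)        # polynomial: PD
assert not looks_psd(lambda a, b: np.sin(a - b), pts)       # not even symmetric
```

A passing check on one point set is of course only necessary, not sufficient: the definition quantifies over all finite subsets of $X$.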

From a geometric perspective, the Moore–Aronszajn theorem reveals that positive definite kernels act like inner products in a (possibly infinite-dimensional) Hilbert space of functions. Each point $x \in X$ is associated with the section $K(\cdot, x)$, which can be viewed as a feature vector in the RKHS. The kernel $K(x, y)$ computes the inner product between the feature vectors corresponding to $x$ and $y$. This viewpoint allows you to interpret kernel methods as linear operations in a high-dimensional feature space, even if you never explicitly construct the space itself. The theorem thus bridges abstract functional analysis and practical computation, making the power of Hilbert space geometry available for analyzing and modeling complex data.
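For some kernels the feature map is available in closed form, so the "kernel = inner product of feature vectors" identity can be verified directly. A minimal illustration, assuming the homogeneous degree-2 polynomial kernel on $\mathbb{R}^2$ (the test points are arbitrary):

```python
import numpy as np

def kern(x, y):
    """Homogeneous degree-2 polynomial kernel on R^2: K(x, y) = (x . y)^2."""
    return (x @ y) ** 2

def phi(x):
    """Explicit feature map phi(x) = (x1^2, sqrt(2) x1 x2, x2^2) into R^3."""
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

# The kernel computes the inner product between the feature vectors:
# (x . y)^2 = (3 - 2)^2 = 1, and phi(x) . phi(y) = 9 - 12 + 4 = 1
assert np.isclose(kern(x, y), phi(x) @ phi(y))
```

For kernels like the Gaussian, the corresponding feature space is infinite-dimensional, so no such finite $\varphi$ exists, yet the kernel evaluation plays exactly the same role.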


