
Introduction to Matrix Decomposition

Solving systems like A\vec{x} = \vec{b} can be computationally intensive, especially for large systems.

Matrix decomposition simplifies this process by breaking the matrix A into simpler parts, which we can then solve in stages.

LU vs QR

We decompose the matrix A into other structured matrices.

LU Decomposition

Break A into a Lower and an Upper triangular matrix:

  • Built using Gaussian elimination;
  • Works best for square matrices.
A = LU

QR Decomposition

Break A into an Orthogonal and an Upper triangular matrix:

  • Often used for non-square matrices;
  • Ideal for least squares problems or when LU fails.
A = QR
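
As a quick preview, here is a minimal sketch (assuming NumPy and SciPy are available) that computes both decompositions of the same matrix and verifies the factorizations numerically:

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[4.0, 3.0],
              [6.0, 3.0]])

# LU: SciPy pivots rows for numerical stability, so it also returns a
# permutation matrix P, with A = P @ L @ U.
P, L, U = lu(A)
print(np.allclose(A, P @ L @ U))  # True

# QR: Q has orthonormal columns, R is upper triangular, A = Q @ R.
Q, R = np.linalg.qr(A)
print(np.allclose(A, Q @ R))      # True
```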

LU Decomposition

Start with a square matrix:

A = \begin{bmatrix} 4 & 3 \\ 6 & 3 \end{bmatrix}

Our goal is to write this as:

A = LU

Where:

L = \begin{bmatrix} 1 & 0 \\ l_{21} & 1 \end{bmatrix},\ \ U = \begin{bmatrix} u_{11} & u_{12} \\ 0 & u_{22} \end{bmatrix}

This decomposition is possible if A is square and invertible, provided no zero pivot appears during elimination; otherwise row exchanges (pivoting) are required.

Important Points:

  • Lower triangular matrices have all zero entries above the diagonal, simplifying forward substitution;
  • Upper triangular matrices have zeros below the diagonal, making backward substitution straightforward;
  • An orthogonal matrix has columns that are orthonormal vectors (vectors of length 1 that are perpendicular);
  • This property preserves vector length and angles, which is useful in solving least squares and improving numerical stability.

Gaussian Elimination

Apply Gaussian elimination to eliminate the entry below the top-left pivot:

R_2 \rightarrow R_2 - \frac{6}{4}R_1

This gives us:

R'_2 = [0, -1.5]

So the updated matrices become:

U = \begin{bmatrix} 4 & 3 \\ 0 & -1.5 \end{bmatrix}

And from our row operation, we know:

L = \begin{bmatrix} 1 & 0 \\ 1.5 & 1 \end{bmatrix}

Important Points:

  • Gaussian elimination systematically eliminates entries below the pivot element in each column by subtracting scaled versions of the pivot row from the rows beneath;
  • This process transforms A into an upper triangular matrix U;
  • The multipliers used to eliminate these entries are stored in L, allowing us to represent A as the product LU.
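
To make the bookkeeping concrete, here is a sketch of this elimination loop in NumPy. Note that `lu_no_pivot` is a hypothetical helper written for this lesson (it assumes nonzero pivots and performs no row exchanges), not a library routine:

```python
import numpy as np

def lu_no_pivot(A):
    """Gaussian elimination without pivoting: returns L, U with A = L @ U."""
    n = A.shape[0]
    U = A.astype(float)      # will be reduced to upper triangular form
    L = np.eye(n)            # unit lower triangular, collects multipliers
    for k in range(n - 1):              # pivot column
        for i in range(k + 1, n):       # rows below the pivot
            m = U[i, k] / U[k, k]       # multiplier, e.g. 6/4 = 1.5
            L[i, k] = m                 # store the multiplier in L
            U[i, k:] -= m * U[k, k:]    # R_i -> R_i - m * R_k
    return L, U

L, U = lu_no_pivot(np.array([[4.0, 3.0], [6.0, 3.0]]))
print(L)  # [[1.  0. ] [1.5 1. ]]
print(U)  # [[ 4.   3. ] [ 0.  -1.5]]
```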

LU Decomposition Result

We verify:

A = LU = \begin{bmatrix} 1 & 0 \\ 1.5 & 1 \end{bmatrix} \begin{bmatrix} 4 & 3 \\ 0 & -1.5 \end{bmatrix} = \begin{bmatrix} 4 & 3 \\ 6 & 3 \end{bmatrix}

Now the system A\vec{x} = \vec{b} can be solved in two steps (sketched in code below):

  1. Solve L\vec{y} = \vec{b} by forward substitution;
  2. Solve U\vec{x} = \vec{y} by backward substitution.
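
A minimal sketch of this two-stage solve, using SciPy's triangular solver on the L and U computed above; the right-hand side b = [7, 9] is an assumed example, not from the lesson:

```python
import numpy as np
from scipy.linalg import solve_triangular

L = np.array([[1.0, 0.0], [1.5, 1.0]])
U = np.array([[4.0, 3.0], [0.0, -1.5]])
b = np.array([7.0, 9.0])          # assumed example right-hand side

y = solve_triangular(L, b, lower=True)   # forward substitution: L y = b
x = solve_triangular(U, y, lower=False)  # backward substitution: U x = y

A = L @ U
print(x)                      # [1. 1.]
print(np.allclose(A @ x, b))  # True: x solves the original system
```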

QR Decomposition

We want to express a matrix A as a product of two matrices:

A = QR

Where:

  • A is your input matrix (e.g. data, coefficients, etc.);
  • Q is an orthogonal matrix (its columns are orthonormal vectors);
  • R is an upper triangular matrix.

An example shape breakdown:

A = \begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix} = \begin{bmatrix} q_1 & q_2 \\ q_3 & q_4 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} \\ 0 & r_{22} \end{bmatrix}

This decomposition is often used when:

  • Matrix A is not square;
  • You are solving least squares problems (see the sketch after this list);
  • LU decomposition isn't numerically stable.
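
For the least squares case, here is a sketch under assumed example data: with A = QR, minimizing the residual of Ax = b reduces to solving the triangular system R x = Q^T b.

```python
import numpy as np
from scipy.linalg import solve_triangular

# A tall (non-square) system: 3 equations, 2 unknowns (made-up example data).
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

Q, R = np.linalg.qr(A)                         # reduced QR: Q is 3x2, R is 2x2
x = solve_triangular(R, Q.T @ b, lower=False)  # R x = Q^T b

# Sanity check against NumPy's built-in least squares solver.
print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))  # True
```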

What Are Orthonormal Vectors?

Orthogonal vectors

Two vectors u, v are orthogonal if their dot product is zero:

u \cdot v = 0

Normalized vector

A vector u is normalized when |u| = 1.

Orthonormal set

A set of vectors \{q_1, q_2, ..., q_k\} is orthonormal if each is unit length and they are mutually orthogonal:

q_i \cdot q_j = \begin{cases} 1,\ \text{if}\ \ i = j,\\ 0,\ \text{if}\ \ i \neq j. \end{cases}

Why it matters: orthonormal columns in Q preserve geometry, simplify projections, and improve numerical stability.
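
A quick numerical check of these properties (a sketch using NumPy's built-in QR): Q^T Q should be the identity, and multiplying by Q should preserve vector length.

```python
import numpy as np

A = np.array([[4.0, 3.0],
              [6.0, 3.0]])
Q, R = np.linalg.qr(A)

print(np.allclose(Q.T @ Q, np.eye(2)))  # True: columns of Q are orthonormal

v = np.array([1.0, 2.0])
print(np.isclose(np.linalg.norm(Q @ v),
                 np.linalg.norm(v)))    # True: |Qv| = |v|, length preserved
```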

Define the Matrix A

Let's start with this example:

A = \begin{bmatrix} 4 & 3 \\ 6 & 3 \end{bmatrix}

We will use the Gram-Schmidt process to find matrices Q and R such that A = QR. The Gram-Schmidt process creates an orthonormal set of vectors from the columns of A.

This means the vectors in Q are all perpendicular (orthogonal) to each other and have unit length (normalized). This property simplifies many calculations and improves numerical stability when solving systems.

So, here the goal is to:

  • Make the columns of Q orthonormal;
  • Create the matrix R, which will encode the projections.

Compute First Basis Vector

We extract the first column of A:

a_1 = \begin{bmatrix} 4 \\ 6 \end{bmatrix}

To normalize this, we compute the norm:

|a_1| = \sqrt{4^2 + 6^2} = \sqrt{16 + 36} = \sqrt{52}

Then:

q_1 = \frac{1}{\sqrt{52}} \begin{bmatrix} 4 \\ 6 \end{bmatrix} = \begin{bmatrix} \frac{4}{\sqrt{52}} \\ \frac{6}{\sqrt{52}} \end{bmatrix}

This is the first orthonormal vector for Q.
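
The same computation in NumPy, as a sketch matching the numbers above:

```python
import numpy as np

a1 = np.array([4.0, 6.0])        # first column of A
q1 = a1 / np.linalg.norm(a1)     # divide by sqrt(52) ≈ 7.2111

print(q1)                                    # ≈ [0.5547 0.8321]
print(np.isclose(np.linalg.norm(q1), 1.0))   # True: q1 has unit length
```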

How to Normalize a Vector

Given a vector:

v = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}

We compute its norm:

|v| = \sqrt{v_1^2 + v_2^2 + ... + v_n^2}

Then normalize:

\hat{v} = \frac{1}{|v|}v

Example:

v = \begin{bmatrix} 3 \\ 4 \end{bmatrix},\ \ |v| = \sqrt{3^2 + 4^2} = 5

So, our normalized vector is:

\hat{v} = \frac{1}{5}\begin{bmatrix} 3 \\ 4 \end{bmatrix} = \begin{bmatrix} 0.6 \\ 0.8 \end{bmatrix}
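
A minimal sketch of this recipe; `normalize` here is a hypothetical helper written for illustration, not a library function:

```python
import numpy as np

def normalize(v):
    """Return v scaled to unit length: v / |v|."""
    return v / np.linalg.norm(v)

print(normalize(np.array([3.0, 4.0])))  # [0.6 0.8]
```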

Once we know how to normalize and orthogonalize vectors, we can apply the Gram-Schmidt process to form the Q matrix, and use it to compute R in the QR decomposition.

Compute q₂ Using Gram-Schmidt

To compute q_2, we start with the second column of A:

a_2 = \begin{bmatrix} 3 \\ 3 \end{bmatrix}

Next, you project a_2 onto q_1:

r_{12} = q_1^T a_2 = \frac{1}{\sqrt{52}}(4 \cdot 3 + 6 \cdot 3) = \frac{30}{\sqrt{52}}

Remove the projection from a_2:

u_2 = a_2 - r_{12}q_1

Then normalize (as was shown above):

q_2 = \frac{u_2}{|u_2|}

Now both q_1 and q_2 form an orthonormal basis for Q. You now assemble the final result:

Q = \begin{bmatrix} q_1 & q_2 \end{bmatrix},\ \ R = \begin{bmatrix} r_{11} & r_{12} \\ 0 & r_{22} \end{bmatrix}

These satisfy:

A = QR
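
Putting the whole process together, here is a sketch of classical Gram-Schmidt QR for a matrix with linearly independent columns; `gram_schmidt_qr` is a hypothetical helper for this lesson, not a library routine:

```python
import numpy as np

def gram_schmidt_qr(A):
    """Classical Gram-Schmidt: returns Q (orthonormal columns) and R (upper
    triangular) with A = Q @ R. Assumes A has linearly independent columns."""
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        u = A[:, j].astype(float)
        for i in range(j):
            R[i, j] = Q[:, i] @ A[:, j]   # projection coefficient r_ij
            u -= R[i, j] * Q[:, i]        # remove the projection onto q_i
        R[j, j] = np.linalg.norm(u)
        Q[:, j] = u / R[j, j]             # normalize to get q_j
    return Q, R

A = np.array([[4.0, 3.0],
              [6.0, 3.0]])
Q, R = gram_schmidt_qr(A)
print(np.allclose(A, Q @ R))  # True: the factorization reproduces A
```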
