Variance, Covariance, and the Covariance Matrix
Variance measures how much a variable deviates from its mean.
The formula for variance of a variable x is:
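$$\mathrm{Var}(x) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

This is the population form; dividing by $n - 1$ instead of $n$ gives the unbiased sample variance.

Covariance measures how two variables change together.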
The formula for the covariance of variables x and y is:
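$$\mathrm{Cov}(x, y) = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$

The covariance matrix generalizes covariance to multiple variables. For a dataset $X$ with $d$ features and $n$ samples, the covariance matrix $\Sigma$ is a $d \times d$ matrix where each entry $\Sigma_{ij}$ is the covariance between feature $i$ and feature $j$. The denominator $n - 1$ makes each entry an unbiased estimator. In matrix form, with $X_c$ denoting the mean-centered data:

$$\Sigma = \frac{1}{n - 1} X_c^{\top} X_c$$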
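To make the scalar formulas concrete, here is a minimal sketch that evaluates them by hand for two short vectors. The vectors `x` and `y` reuse the values of the two feature columns from the example data in this section. The variable names (`var_population`, `var_sample`, `cov_xy`) are illustrative choices, not part of the lesson.

```python
import numpy as np

# Two variables with three observations each (illustrative values)
x = np.array([2.5, 0.5, 2.2])
y = np.array([2.4, 0.7, 2.9])
n = x.size

# Variance with the 1/n denominator (population form)
var_population = np.sum((x - x.mean()) ** 2) / n

# Variance with the 1/(n - 1) denominator (unbiased sample form)
var_sample = np.sum((x - x.mean()) ** 2) / (n - 1)

# Sample covariance of x and y with the 1/(n - 1) denominator
cov_xy = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)

print("Population variance of x:", var_population)
print("Sample variance of x:", var_sample)
print("Sample covariance of x and y:", cov_xy)
```

The hand-computed values should agree with `np.var(x)` (which defaults to `ddof=0`), `np.var(x, ddof=1)`, and the off-diagonal entry of `np.cov(x, y)`, respectively.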
```python
import numpy as np

# Example data: 3 samples (rows), 2 features (columns)
X = np.array([[2.5, 2.4],
              [0.5, 0.7],
              [2.2, 2.9]])

# Center the data by subtracting each feature's mean
X_centered = X - np.mean(X, axis=0)

# Compute the covariance matrix manually, dividing by n - 1
# so that it matches the unbiased definition given above
n_samples = X_centered.shape[0]
cov_matrix = (X_centered.T @ X_centered) / (n_samples - 1)

print("Covariance matrix:\n", cov_matrix)
```
In the code above, you manually center the data and compute the covariance matrix using matrix multiplication. This matrix captures how each pair of features varies together.
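As a sanity check, a manual computation that divides by n − 1, as the definition in this section specifies, can be compared against NumPy's built-in `np.cov`. The sketch below assumes the same 3×2 example data. The names `cov_manual`, `cov_numpy`, and the three boolean flags are illustrative. Note that `np.cov` treats rows as variables by default, so `rowvar=False` is needed when features are stored in columns.

```python
import numpy as np

# Same example data: 3 samples (rows), 2 features (columns)
X = np.array([[2.5, 2.4],
              [0.5, 0.7],
              [2.2, 2.9]])

# Manual covariance matrix with the n - 1 denominator
n_samples = X.shape[0]
X_centered = X - X.mean(axis=0)
cov_manual = (X_centered.T @ X_centered) / (n_samples - 1)

# NumPy's estimate; rowvar=False means each column is a feature
cov_numpy = np.cov(X, rowvar=False)

# The two results should agree up to floating-point error
matches_numpy = np.allclose(cov_manual, cov_numpy)

# A covariance matrix is symmetric: Cov(i, j) equals Cov(j, i)
is_symmetric = np.allclose(cov_manual, cov_manual.T)

# Its diagonal holds each feature's sample variance
diag_is_variance = np.allclose(np.diag(cov_manual), X.var(axis=0, ddof=1))

print("Matches np.cov:", matches_numpy)
print("Symmetric:", is_symmetric)
print("Diagonal equals sample variances:", diag_is_variance)
```

All three checks should print `True`. Dividing by n instead of n − 1 would make the first and third checks fail unless `np.cov` is called with `bias=True` and the variances with `ddof=0`.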