Probability Theory Basics
5. Covariance and Correlation
What is Correlation?
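Correlation is a statistical measure that quantifies the strength and direction of the linear relationship between two variables. It is defined as the covariance scaled by the product of the two variables' standard deviations. Because of this scaling, correlation does not depend on the units of the variables, so it tells us how strong the dependence is, not only its direction.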
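The effect of the scaling can be checked numerically. The following is a minimal sketch, not part of the original lesson: the helper name scaled_covariance, the seed, and the sample data are all illustrative assumptions. It rescales x by a factor of 1000 and compares how covariance and correlation react.

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed (arbitrary choice) for reproducibility
x = rng.normal(size=500)
y = 2 * x + rng.normal(size=500)    # y depends linearly on x, plus noise


def scaled_covariance(a, b):
    """Sample covariance divided by the product of the sample standard deviations."""
    cov_ab = np.cov(a, b)[0, 1]     # np.cov uses ddof=1 by default
    return cov_ab / (np.std(a, ddof=1) * np.std(b, ddof=1))


# Rescale x, as if changing its units (for example metres to millimetres)
x_scaled = 1000 * x

cov_original = np.cov(x, y)[0, 1]
cov_rescaled = np.cov(x_scaled, y)[0, 1]
corr_original = scaled_covariance(x, y)
corr_rescaled = scaled_covariance(x_scaled, y)

print("covariance: ", cov_original, "->", cov_rescaled)    # grows by the factor 1000
print("correlation:", corr_original, "->", corr_rescaled)  # stays the same
```

The covariance changes with the units of x, while the scaled value does not, which is why correlation can be compared across different pairs of variables.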
Correlation ranges between -1 and 1, where:
- If the correlation is +1, the values have a perfect positive linear relationship: as one variable increases, the other increases proportionally;
- If the correlation is -1, the values have a perfect negative linear relationship: as one variable increases, the other decreases proportionally;
- If the correlation coefficient is close to 0, there is little or no linear relationship between the variables.
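The three cases above can be illustrated with noise-free data. This is a small sketch under assumed example functions (the slopes, intercepts, and the x ** 2 curve are arbitrary choices, not taken from the lesson). The last case is worth noting: y is fully determined by x, yet the correlation is approximately 0, because the relationship is not linear.

```python
import numpy as np

x = np.linspace(-1, 1, 101)   # values symmetric around 0

y_up = 2 * x + 3              # exact increasing line
y_down = -0.5 * x + 1         # exact decreasing line
y_curve = x ** 2              # depends on x, but not linearly

corr_up = np.corrcoef(x, y_up)[0, 1]
corr_down = np.corrcoef(x, y_down)[0, 1]
corr_curve = np.corrcoef(x, y_curve)[0, 1]

print("increasing line:", round(corr_up, 6))
print("decreasing line:", round(corr_down, 6))
print("parabola:       ", round(corr_curve, 6))
```

A correlation near 0 therefore rules out a linear trend only; it does not by itself show that the variables are independent.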
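To calculate the correlation, we can proceed as we did for covariance, this time using np.corrcoef(x, y)[0, 1]. The function np.corrcoef(x, y) returns a 2×2 correlation matrix, and the [0, 1] element is the correlation between x and y.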
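To see why the [0, 1] index is needed, it may help to print the whole matrix. The sketch below uses made-up sample data (the seed and sizes are assumptions chosen only for illustration).

```python
import numpy as np

rng = np.random.default_rng(1)   # fixed seed (arbitrary choice) for reproducibility
x = rng.random(100) * 10         # random x values in [0, 10)
y = x + rng.normal(size=100)     # y follows x, with added noise

corr_matrix = np.corrcoef(x, y)  # rows and columns are ordered as (x, y)
print(corr_matrix.shape)
print(corr_matrix)

# Diagonal entries are the correlation of each variable with itself.
# The two off-diagonal entries are equal and hold the correlation of x and y.
corr_xy = corr_matrix[0, 1]
print("correlation of x and y:", round(corr_xy, 3))
```

Because the matrix is symmetric, corr_matrix[1, 0] gives the same value as corr_matrix[0, 1].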
```python
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(0)  # Set random seed so that all three panels are reproducible

# Create a figure with three subplots
fig, axes = plt.subplots(1, 3)
fig.set_size_inches(10, 5)

# Positive linear dependence
x = np.random.rand(100) * 10    # Generate random x values
y = x + np.random.randn(100)    # Generate y values with added noise
axes[0].scatter(x, y)           # Scatter plot of x and y
# Set title with correlation coefficient
axes[0].set_title('Correlation is ' + str(round(np.corrcoef(x, y)[0, 1], 3)))

# Negative linear dependence
x = np.random.rand(100) * 10    # Generate random x values
y = -x + np.random.randn(100)   # Generate y values with added noise
axes[1].scatter(x, y)           # Scatter plot of x and y
# Set title with correlation coefficient
axes[1].set_title('Correlation is ' + str(round(np.corrcoef(x, y)[0, 1], 3)))

# Independent
x = np.random.rand(200)         # Generate random x values
y = np.random.rand(200)         # Generate random y values
axes[2].scatter(x, y)           # Scatter plot of x and y
# Set title with correlation coefficient
axes[2].set_title('Correlation is ' + str(round(np.corrcoef(x, y)[0, 1], 3)))

plt.show()  # Display the plot
```
Section 5. Chapter 2