Probability Theory Basics

What is Correlation?

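Correlation is a statistical measure that quantifies the relationship between two variables. It is the covariance scaled by the product of the two variables' standard deviations. Because of this scaling, correlation shows the strength of a linear dependence in addition to its direction, whereas the magnitude of covariance depends on the units of the variables.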
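The scaling can be checked numerically. The sketch below is illustrative only: the data, the seed, and the variable names are assumptions made for this example, not part of the lesson. It divides the sample covariance by the product of the sample standard deviations, then rescales x by 100 to show that the covariance changes while the correlation does not.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, chosen only for repeatability
x = rng.normal(size=500)
y = 2.0 * x + rng.normal(size=500)

# Sample covariance (np.cov uses ddof=1 by default)
cov_xy = np.cov(x, y)[0, 1]

# Correlation = covariance / (std of x * std of y), with matching ddof=1
corr_manual = cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))

# Rescale x, for instance as if its units had changed
x_scaled = 100.0 * x
cov_scaled = np.cov(x_scaled, y)[0, 1]
corr_scaled = cov_scaled / (np.std(x_scaled, ddof=1) * np.std(y, ddof=1))

print("covariance:  ", cov_xy, "->", cov_scaled)
print("correlation: ", corr_manual, "->", corr_scaled)
```

The covariance grows by the same factor of 100 as x, while the correlation stays the same, which is why correlation can be compared across variables measured in different units.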
Correlation ranges between -1 and 1, where:

  1. If the correlation is +1, the variables have a perfect positive linear relationship. As one variable increases, the other increases proportionally;

  2. If the correlation is -1, the variables have a perfect negative linear relationship. As one variable increases, the other decreases proportionally;

  3. If the correlation is close to 0, there is little or no linear relationship between the variables. A nonlinear relationship may still exist.
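The three cases can be illustrated with noise-free data. This is a minimal sketch; the particular lines and the quadratic are assumptions chosen for the illustration. Two exact straight lines give correlations of +1 and -1 (up to floating-point rounding), and a quadratic on a grid symmetric around zero gives a correlation of essentially 0 even though y is fully determined by x.

```python
import numpy as np

x = np.linspace(-5, 5, 101)  # evenly spaced grid, symmetric around 0

r_pos = np.corrcoef(x, 3 * x + 2)[0, 1]     # exact increasing line
r_neg = np.corrcoef(x, -0.5 * x + 1)[0, 1]  # exact decreasing line
r_quad = np.corrcoef(x, x ** 2)[0, 1]       # nonlinear: y depends on x, but not linearly

print("increasing line:", r_pos)
print("decreasing line:", r_neg)
print("quadratic:      ", r_quad)
```

The last case is the reason a correlation near 0 should be read as "no linear relationship" rather than "no relationship".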

Calculating the correlation follows the same steps as calculating the covariance: call np.corrcoef(x, y), which returns the 2x2 correlation matrix, and take its [0, 1] element.

import matplotlib.pyplot as plt
import numpy as np

np.random.seed(0)  # Set the seed once, up front, so all three panels are reproducible

# Create a figure with three side-by-side subplots
fig, axes = plt.subplots(1, 3, figsize=(10, 5))

# Positive linear dependence
x = np.random.rand(100) * 10  # x values, uniform on [0, 10)
y = x + np.random.randn(100)  # y = x plus standard normal noise
axes[0].scatter(x, y)
axes[0].set_title(f'Correlation is {np.corrcoef(x, y)[0, 1]:.3f}')

# Negative linear dependence
x = np.random.rand(100) * 10  # x values, uniform on [0, 10)
y = -x + np.random.randn(100)  # y = -x plus standard normal noise
axes[1].scatter(x, y)
axes[1].set_title(f'Correlation is {np.corrcoef(x, y)[0, 1]:.3f}')

# Independent
x = np.random.rand(200)  # x values, uniform on [0, 1)
y = np.random.rand(200)  # y values drawn independently of x
axes[2].scatter(x, y)
axes[2].set_title(f'Correlation is {np.corrcoef(x, y)[0, 1]:.3f}')

plt.show()  # Display the figure
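The [0, 1] index used in the titles picks one entry out of the matrix that np.corrcoef returns. The sketch below, with illustrative data and a seed that are assumptions of this example, prints the full matrix so the layout is visible: the diagonal holds each variable's correlation with itself (always 1), and the two off-diagonal entries hold the same x-y correlation.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed for repeatability
x = rng.uniform(0, 10, size=100)
y = x + rng.normal(size=100)

corr_matrix = np.corrcoef(x, y)  # 2x2 matrix: rows and columns are (x, y)
r = corr_matrix[0, 1]            # correlation between x and y

print(corr_matrix)
print("correlation of x and y:", r)
```

Because the matrix is symmetric, corr_matrix[1, 0] would give the same value as corr_matrix[0, 1].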

Section 5. Chapter 2
