Correlation and Covariance
Understanding how variables relate to each other is essential in statistics. Correlation and covariance are two fundamental measures that quantify these relationships. Both indicate the direction of a linear relationship between two variables, but they differ in scale and interpretation: covariance is expressed in the units of the variables, while correlation is standardized and also conveys the strength of the relationship.
Covariance
- Indicates whether two variables move together;
- Positive covariance: both variables increase or decrease together;
- Negative covariance: one variable increases while the other decreases;
- Magnitude depends on the scale of the variables, making comparison across datasets difficult.
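To make the scale dependence concrete, here is a minimal sketch that computes the sample covariance by hand (using the n - 1 denominator, which is also what `np.cov` uses by default). The helper name `sample_covariance` and the small data lists are illustrative assumptions, not part of the lesson's dataset.

```python
import numpy as np


def sample_covariance(x, y):
    """Sample covariance of two equal-length sequences (n - 1 denominator)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Average product of the deviations from each mean
    return float(np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1))


# Illustrative values only
hours = [1, 2, 3, 4, 5]
score = [52, 58, 65, 70, 80]

cov_hours = sample_covariance(hours, score)

# Same study time, expressed in minutes instead of hours
minutes = [h * 60 for h in hours]
cov_minutes = sample_covariance(minutes, score)

print("Covariance (hours):  ", cov_hours)
print("Covariance (minutes):", cov_minutes)
```

The relationship between the two variables is identical in both cases, yet the covariance is 60 times larger when study time is measured in minutes. This is why raw covariance values are hard to compare across datasets.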
Correlation
- Standardizes the relationship by dividing the covariance by the product of the variables' standard deviations;
- Results in a value between -1 and 1;
  - 1 means a perfect positive linear relationship;
  - -1 means a perfect negative linear relationship;
  - 0 means no linear relationship;
- Unitless, so it is easier to interpret and compare.
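The standardization described above can be sketched as follows: divide the sample covariance by the product of the two sample standard deviations. The helper name `pearson_correlation` and the data are illustrative assumptions; `np.corrcoef` is used only as a cross-check.

```python
import numpy as np


def pearson_correlation(x, y):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov_xy = np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1)
    # ddof=1 keeps the standard deviations consistent with the n - 1 covariance
    return float(cov_xy / (x.std(ddof=1) * y.std(ddof=1)))


# Illustrative values only
hours = [1, 2, 3, 4, 5]
score = [52, 58, 65, 70, 80]
minutes = [h * 60 for h in hours]

r_hours = pearson_correlation(hours, score)
r_minutes = pearson_correlation(minutes, score)

# A perfectly decreasing straight line gives a correlation of -1
r_perfect_negative = pearson_correlation(hours, [10, 8, 6, 4, 2])

print("r (hours):  ", r_hours)
print("r (minutes):", r_minutes)
print("r (perfect negative):", r_perfect_negative)
print("np.corrcoef check:", np.corrcoef(hours, score)[0, 1])
```

Changing the unit from hours to minutes leaves the correlation unchanged, which is what "unitless" means in practice.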
Calculating Correlation and Covariance in Python
You can use libraries like numpy and pandas to calculate both covariance and correlation:
- The cov() function in pandas computes the covariance matrix;
- The corr() function gives you the correlation matrix.
These matrices show the pairwise relationships between all variables in your dataset.
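As a small sketch of how to read these matrices, the example below pulls a single pair out of each one and compares it with the pairwise `Series.cov()` and `Series.corr()` methods. The two-column DataFrame is an illustrative assumption chosen to keep the output short.

```python
import pandas as pd

# Illustrative two-column dataset
df = pd.DataFrame({
    "hours_studied": [2, 3, 5, 7, 9],
    "exam_score": [55, 60, 75, 85, 95],
})

cov_matrix = df.cov()
corr_matrix = df.corr()

# Off-diagonal entries hold the pairwise values
pair_cov = cov_matrix.loc["hours_studied", "exam_score"]
pair_corr = corr_matrix.loc["hours_studied", "exam_score"]

# The same numbers, computed directly for one pair of columns
direct_cov = df["hours_studied"].cov(df["exam_score"])
direct_corr = df["hours_studied"].corr(df["exam_score"])

# Diagonal of the covariance matrix: each variable's variance
var_hours = cov_matrix.loc["hours_studied", "hours_studied"]

print("Covariance: ", pair_cov, direct_cov)
print("Correlation:", pair_corr, direct_corr)
print("Variance of hours_studied:", var_hours)
```

Both matrices are symmetric, the diagonal of the covariance matrix contains the variances, and the diagonal of the correlation matrix is always 1.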
Interpreting Results
- A high correlation does not imply causation;
- Outliers can impact both measures;
- Correlation measures only linear relationships and may miss nonlinear associations.
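Two of the caveats above can be illustrated with a short sketch on synthetic data (the arrays are assumptions made up for this example): a perfectly dependent but nonlinear relationship whose correlation is zero, and a single outlier that distorts an otherwise perfect linear relationship.

```python
import numpy as np

# Nonlinear association: y is fully determined by x, but not linearly
x = np.arange(-5, 6)          # -5, -4, ..., 5
y_quadratic = x ** 2
r_quadratic = np.corrcoef(x, y_quadratic)[0, 1]

# Outlier sensitivity: start from a perfect straight line
x_line = np.arange(1, 11, dtype=float)
y_line = 2 * x_line + 1
r_clean = np.corrcoef(x_line, y_line)[0, 1]

# Replace the last observation with an extreme value
y_outlier = y_line.copy()
y_outlier[-1] = -50
r_outlier = np.corrcoef(x_line, y_outlier)[0, 1]

print("r for y = x^2 on symmetric x:", r_quadratic)
print("r for a clean straight line: ", r_clean)
print("r after one extreme outlier: ", r_outlier)
```

The quadratic case yields a correlation of essentially zero even though y depends entirely on x, and one extreme point is enough to turn a correlation of 1 into a negative value. Plotting the data before trusting either measure is a sensible habit.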
```python
import numpy as np
import pandas as pd

# Realistic dataset: Student statistics
data = {
    "hours_studied": [2, 3, 5, 7, 9, 1, 4, 8, 6, 10],
    "exam_score": [55, 60, 75, 85, 95, 50, 70, 90, 80, 98],
    "social_media": [6, 5, 4, 2, 1, 7, 5, 2, 3, 0],
    "hours_slept": [7, 6, 8, 7, 6, 8, 7, 6, 8, 7]
}

df = pd.DataFrame(data)

# Compute covariance matrix
cov_matrix = df.cov()
print("Covariance matrix:")
print(cov_matrix)

# Compute correlation matrix
corr_matrix = df.corr()
print("\nCorrelation matrix:")
print(corr_matrix)
```