# Learning Statistics with Python

## Covariance

**Covariance** is a measure of the joint variability of two random variables.

| Covariance value | Meaning |
|---|---|
| Positive | The two variables tend to move in the same direction |
| 0 | The two variables have no linear relationship |
| Negative | The two variables tend to move in opposite directions |

The formulas are different for the **sample** and **population**, but we will not dive deeper into them. In this chapter, we will discuss covariances of the following dataset:

| | Store_ID | Store_Area | Items_Available | Daily_Customer_Count | Store_Sales |
|---|---|---|---|---|---|
| 0 | 0 | 1659 | 1961 | 530 | 66490 |
| 1 | 1 | 1461 | 1752 | 210 | 39820 |
| 2 | 2 | 1340 | 1609 | 720 | 54010 |
| 3 | 3 | 1451 | 1748 | 620 | 53730 |
| 4 | 4 | 1770 | 2111 | 450 | 46620 |

- `Store_ID`: The unique ID of the store.
- `Store_Area`: The area of the store.
- `Items_Available`: The number of items that are available in the store.
- `Daily_Customer_Count`: The daily number of customers in the store.
- `Store_Sales`: The number of sales in the store.

**Calculating Covariance with Python:**

To compute covariance in Python, you can use the `np.cov()` function from the **NumPy** library. Pass it the two sequences of data whose covariance you want to calculate.

For two sequences, `np.cov()` returns a 2×2 matrix, and the covariance of the two variables is the value at index `[0, 1]`. This course won't cover the other values in the output.
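As a minimal sketch, the calculation can be tried on the five rows shown in the table above. The array names `store_area` and `items_available` are illustrative, and because only these five rows are used (not the full dataset), the number printed here is for illustration only.

```python
import numpy as np

# Values copied from the five rows shown in the table above
# (an assumption: the full dataset is not reproduced here).
store_area = np.array([1659, 1461, 1340, 1451, 1770])
items_available = np.array([1961, 1752, 1609, 1748, 2111])

# For two 1-D sequences, np.cov returns a 2x2 covariance matrix.
cov_matrix = np.cov(store_area, items_available)

# The covariance between the two variables is the off-diagonal entry.
cov_value = cov_matrix[0, 1]
print(cov_value)
```

By default `np.cov()` uses the sample formula (dividing by N - 1); passing `ddof=0` gives the population version instead.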

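For `Store_Area` and `Items_Available`, the covariance is positive, which indicates that the two values tend to move in the same direction. This makes sense because a larger store area corresponds to a greater number of items. One significant drawback of covariance is that its value is unbounded and depends on the units of the variables, so its magnitude is hard to interpret on its own.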
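The dependence on units can be sketched with the same five rows. The factor of 100 below is a hypothetical unit change chosen purely for illustration (the source does not state the units of `Store_Area`). The relationship between the variables is unchanged, yet the covariance grows by the same factor.

```python
import numpy as np

# Values copied from the five rows shown in the table above.
store_area = np.array([1659, 1461, 1340, 1451, 1770])
items_available = np.array([1961, 1752, 1609, 1748, 2111])

cov_original = np.cov(store_area, items_available)[0, 1]

# Hypothetical unit change: express the area in a unit 100 times smaller,
# so every area value becomes 100 times larger.
scale = 100
cov_rescaled = np.cov(store_area * scale, items_available)[0, 1]

# The rescaled covariance equals scale * cov_original.
print(cov_original, cov_rescaled)
```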
