Statistics for Data Analysis

Covariance


Definition

Covariance is a measure of the joint variability of two random variables.

The formulas for sample and population covariance differ, but they will not be discussed in detail here. This chapter focuses on calculating the covariance for the following dataset:

  • Store_ID: the unique id of the store;
  • Store_Area: the area of the store;
  • Items_Available: the number of items that are available in the store;
  • Daily_Customer_Count: the daily number of customers in the store;
  • Store_Sales: the number of sales in the store.
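As a brief aside on the sample and population formulas mentioned above, the sketch below computes both by hand. It assumes the usual definitions (the sum of products of deviations from the means, divided by n - 1 for a sample and by n for a population) and uses small made-up arrays x and y rather than the Stores data.

```python
import numpy as np

# Hypothetical toy data, not taken from the Stores dataset
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Product of each pair's deviations from the respective means
products = (x - x.mean()) * (y - y.mean())
n = len(x)

# Sample covariance divides by n - 1, population covariance by n
sample_cov = products.sum() / (n - 1)
population_cov = products.sum() / n

print(sample_cov, population_cov)
```

The two values differ only in the divisor, so the gap between them shrinks as the number of observations grows.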

Calculating Covariance with Python

To compute covariance in Python, use the np.cov() function from the NumPy library. Pass it the two data sequences for which you want to calculate the covariance.

np.cov() returns a covariance matrix, and the covariance of the two sequences is the value at index [0, 1]. This course won't cover the other values in the output. Refer to the example:

    import pandas as pd
    import numpy as np

    df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/a849660e-ddfa-4033-80a6-94a1b7772e23/update/Stores.csv')

    # Calculating covariance
    cov = np.cov(df['Store_Area'], df['Items_Available'])[0,1]
    print(round(cov, 2))
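For readers curious about the rest of the np.cov() output, here is a minimal sketch on made-up data. The column names area and items are hypothetical stand-ins for the dataset columns. It shows that the result is a 2x2 matrix with the variances on the diagonal and the same covariance in both off-diagonal cells, and that the pandas method Series.cov() gives the same number.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for two columns of the Stores dataset
toy = pd.DataFrame({
    "area": [1.0, 2.0, 3.0, 4.0, 5.0],
    "items": [2.0, 4.0, 5.0, 4.0, 5.0],
})

# np.cov returns a 2x2 matrix:
# [0, 0] is the variance of the first sequence,
# [1, 1] is the variance of the second sequence,
# [0, 1] and [1, 0] both hold the covariance between them
matrix = np.cov(toy["area"], toy["items"])
cov_numpy = matrix[0, 1]

# pandas offers the same calculation directly on a Series
cov_pandas = toy["area"].cov(toy["items"])

print(matrix)
print(cov_numpy, cov_pandas)
```

Both np.cov() and Series.cov() use the sample formula (division by n - 1) by default, which is why the two results agree.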

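A positive result indicates that the values tend to move in the same direction. This makes sense because a larger store area corresponds to a greater number of items. One significant drawback of covariance is that its value is unbounded: it depends on the units of the variables, so the magnitude alone does not show how strong the relationship is.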
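The drawback noted above can be illustrated by rescaling one variable. In the sketch below, which uses made-up arrays rather than the Stores data, multiplying x by 100 (as if changing its unit of measurement) multiplies the covariance by 100 as well, even though the relationship between the variables is unchanged.

```python
import numpy as np

# Hypothetical toy data, not taken from the Stores dataset
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Covariance in the original units
cov_original = np.cov(x, y)[0, 1]

# Same data with x expressed in a unit 100 times smaller
cov_rescaled = np.cov(x * 100, y)[0, 1]

print(cov_original, cov_rescaled)
```

The sign stays the same, so the direction of the relationship can still be read from it, but the size of the number cannot be compared across differently scaled data.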


Which statements about covariance are correct?

Select all correct answers
