Learn Implementing Spread in Python | Probability & Statistics
Mathematics for Data Science

Implementing Spread in Python

Define the Dataset

Here, we assign an array to the variable data to ensure we have a consistent dataset to work with for all calculations.

import numpy as np

# Create a numpy array of daily sales
data = np.array([10, 15, 12, 18, 20, 22, 14, 17, 11, 16])

Calculate Population Statistics

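Here, we compute the mean, variance, and standard deviation, treating the data as a complete population. np.mean takes the array as input and returns the average of all elements, which summarizes the central tendency of the dataset; np.var and np.std measure how widely the values spread around that mean.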

mean_val = np.mean(data)       # Mean
variance_val = np.var(data)    # Population variance (ddof=0 by default)
std_dev_val = np.std(data)     # Population standard deviation
  • np.mean(data) computes the arithmetic mean (average);
  • np.var(data) calculates the population variance (divides by $n$);
  • np.std(data) calculates the population standard deviation (square root of variance).

import numpy as np

# Create a numpy array of daily sales
data = np.array([10, 15, 12, 18, 20, 22, 14, 17, 11, 16])

mean_val = np.mean(data)       # Mean
variance_val = np.var(data)    # Population variance (ddof=0 by default)
std_dev_val = np.std(data)     # Population standard deviation

print(f"Mean: {mean_val}")
print(f"Variance (Population): {variance_val}")
print(f"Standard Deviation (Population): {std_dev_val}")
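
To see what these functions do internally, the same population statistics can be worked out step by step from their definitions. This is a minimal sketch rather than how NumPy implements them; the names n, mean_manual, squared_deviations, variance_manual, and std_dev_manual are introduced here only for illustration.

```python
import numpy as np

# Same daily sales data as above
data = np.array([10, 15, 12, 18, 20, 22, 14, 17, 11, 16])

n = data.size                                   # Number of observations
mean_manual = data.sum() / n                    # Arithmetic mean
squared_deviations = (data - mean_manual) ** 2  # Squared distance of each value from the mean
variance_manual = squared_deviations.sum() / n  # Population variance: divide by n
std_dev_manual = variance_manual ** 0.5         # Population standard deviation

print(f"Mean (manual): {mean_manual}")
print(f"Variance (manual): {variance_manual}")
print(f"Standard Deviation (manual): {std_dev_manual}")
```

The manual values should agree with np.mean(data), np.var(data), and np.std(data) up to floating-point rounding.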

Calculate Sample Statistics

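To get an unbiased estimate of the variance from a sample, we use ddof=1. This applies Bessel's correction, dividing the sum of squared deviations by $(n-1)$ instead of $n$. The sample standard deviation is the square root of this corrected variance.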

sample_variance_val = np.var(data, ddof=1)
sample_std_dev_val = np.std(data, ddof=1)
  • np.var(data, ddof=1) - sample variance;
  • np.std(data, ddof=1) - sample standard deviation.

import numpy as np

# Create a numpy array of daily sales
data = np.array([10, 15, 12, 18, 20, 22, 14, 17, 11, 16])

sample_variance_val = np.var(data, ddof=1)
sample_std_dev_val = np.std(data, ddof=1)

print(f"Variance (Sample): {sample_variance_val}")
print(f"Standard Deviation (Sample): {sample_std_dev_val}")
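
Because the two variances differ only in the divisor, the sample variance is the population variance scaled by n / (n - 1). The short check below illustrates that relationship on the same data; the variable names n, population_variance, sample_variance, and rescaled are assumptions made for this sketch.

```python
import numpy as np

# Same daily sales data as above
data = np.array([10, 15, 12, 18, 20, 22, 14, 17, 11, 16])

n = data.size
population_variance = np.var(data)       # Divides by n (ddof=0)
sample_variance = np.var(data, ddof=1)   # Divides by n - 1 (Bessel's correction)

# Rescaling the population variance by n / (n - 1) gives the sample variance
rescaled = population_variance * n / (n - 1)

print(f"Population variance: {population_variance}")
print(f"Sample variance: {sample_variance}")
print(f"Rescaled population variance: {rescaled}")
```

For a dataset this small the correction is noticeable; as n grows, n / (n - 1) approaches 1 and the two results converge.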
Note

Standard deviation is the square root of variance, giving a measure of spread in the same units as the original data, making it easier to interpret.
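
As a quick check of this relationship, we can take the square root of each variance and compare it with the result of np.std. This is only an illustrative sketch; the dictionary name results is introduced here and is not part of the lesson's code.

```python
import numpy as np

# Same daily sales data as above
data = np.array([10, 15, 12, 18, 20, 22, 14, 17, 11, 16])

results = {}
for ddof in (0, 1):  # 0 = population, 1 = sample
    variance = np.var(data, ddof=ddof)   # Measured in squared sales units
    std_dev = np.std(data, ddof=ddof)    # Measured in the same units as the data
    results[ddof] = (np.sqrt(variance), std_dev)
    print(f"ddof={ddof}: sqrt(variance)={np.sqrt(variance)}, std={std_dev}")
```

In both cases the two printed numbers should match up to floating-point rounding.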

