# Probability Theory Basics

## Sample Variance and Standard Deviation

## Sample variance

**Sample variance** is a statistical measure that quantifies how far the data points in a dataset are spread out from their mean value.

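We can calculate it using the following **formula**:

$$
s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2
$$

Note

In this context, the sample size $n$ and the sample mean $\bar{x}$ are characteristics calculated from the existing dataset, and $x_i$ is the $i$-th data point.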

Keep in mind a **fundamental rule**: the greater the sample variance, the more spread out the data are around the mean.
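To make this rule concrete, here is a minimal sketch that compares two made-up datasets with the same mean but different spread. The helper `sample_variance` and both lists are hypothetical and exist only for illustration; the helper divides by the number of values minus one, which is the usual convention for a sample.

```python
def sample_variance(values):
    """Illustrative helper: average squared deviation from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


# Two made-up datasets, both with mean 50
narrow = [48, 49, 50, 51, 52]   # values close to the mean
wide = [10, 30, 50, 70, 90]     # values far from the mean

narrow_var = sample_variance(narrow)
wide_var = sample_variance(wide)

print(narrow_var)  # 2.5
print(wide_var)    # 1000.0
```

Both datasets are centered on the same value, but the second one has a much larger variance because its points lie much farther from the mean.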

### How to calculate sample variance in Python?

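In **NumPy**, pass the sequence of values (in our case, a column of the dataset) to the function `np.var()`, for example `np.var(df['salary'])`. By default, `np.var()` divides by the number of values $n$ (`ddof=0`), which gives the population variance. To divide by $n - 1$ and get the unbiased sample variance, pass `ddof=1`, as in `np.var(df['salary'], ddof=1)`.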
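The following sketch shows how the `ddof` argument changes the result. The `salaries` array is made-up data standing in for a column such as `df['salary']`; the manual calculation is included only to check the result against the definition.

```python
import numpy as np

# Made-up salary values standing in for a dataset column such as df['salary']
salaries = np.array([3200.0, 4100.0, 3900.0, 5200.0, 4600.0])

n = salaries.size
mean = salaries.mean()

# Manual calculation: sum of squared deviations divided by n - 1
manual_var = np.sum((salaries - mean) ** 2) / (n - 1)

# ddof=1 makes NumPy divide by n - 1 (sample variance)
sample_var = np.var(salaries, ddof=1)

# The default ddof=0 divides by n (population variance)
population_var = np.var(salaries)

print(manual_var, sample_var, population_var)
```

With only five values the two results differ noticeably (565000 versus 452000 here); the gap shrinks as the sample grows.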

## Standard deviation

**Standard deviation** is closely related to the variance: it is the square root of the variance, so it is expressed in the same units as the data.

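We can calculate it using the `np.std()` function from the NumPy library. Like `np.var()`, it uses `ddof=0` by default, so pass `ddof=1` to get the sample standard deviation.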
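As a quick check of this relationship, the sketch below computes the standard deviation with `np.std()` and compares it with the square root of the variance. The `salaries` array is again made-up data used only for illustration.

```python
import numpy as np

# Made-up salary values used only for illustration
salaries = np.array([3200.0, 4100.0, 3900.0, 5200.0, 4600.0])

# Sample standard deviation (ddof=1 divides by n - 1)
sample_std = np.std(salaries, ddof=1)

# Square root of the sample variance, computed separately
sqrt_of_var = np.sqrt(np.var(salaries, ddof=1))

# The two values agree up to floating-point rounding
print(sample_std, sqrt_of_var)
print(np.isclose(sample_std, sqrt_of_var))  # True
```

The same `ddof` value has to be used in both calls; mixing `ddof=0` and `ddof=1` would make the two numbers disagree.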

Note

All of these dataset characteristics are covered in more detail in the Probability Theory Mastering course. That course also explores how the concepts of probability theory and the statistical properties of data are interrelated.