# Probability Theory

## Normal Distribution

Hi there! It is the right time to move on to more complex distributions: **continuous** ones!

**What is it?**

A **continuous distribution** is a distribution with infinitely many possible outcomes: the variable can take any value within a range. Therefore, we cannot list every outcome with its probability in a table, because the outcomes cannot be counted. Instead, such distributions are described with a graph (a curve), and probabilities are calculated for intervals of values.

Let's start with the most widely used and gripping one, the **normal distribution**!

To work with this distribution, we import the `norm` object from `scipy.stats`. We can then apply functions such as `sf` and `cdf` to it, but not `pmf`. The function that plays the corresponding role for continuous distributions is `pdf`.
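To illustrate why `pmf` does not apply here, the following is a minimal sketch. It assumes a standard normal distribution (the default `loc=0`, `scale=1`) purely for illustration. The probability of an interval shrinks toward zero as the interval narrows around a single point, while `pdf` stays finite because it is the height of the curve, not a probability.

```python
from scipy.stats import norm

# Standard normal (loc=0, scale=1) is assumed here purely for illustration.
probs = []
for width in (1.0, 0.1, 0.001):
    # Probability of landing within +/- width/2 of the value 0.
    prob = norm.cdf(width / 2) - norm.cdf(-width / 2)
    probs.append(prob)
    print(f"interval width {width}: probability {prob:.6f}")

# The density at 0 stays finite; it is a curve height, not a probability.
density_at_zero = norm.pdf(0)
print(density_at_zero)
```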

**Examples:**

- Animal sizes.
- People's heights.
- Birth weights.

To understand the key characteristics, it is better to first look at the graph.

Graph: distribution of imperial penguins' heights in meters.

**Key characteristics:**

- The graph is bell-shaped: most values cluster around the mean and become rarer the farther they are from it.
- The graph is **symmetric** about the mean.
- It has thin tails.

**Graph explanation:**

I guess you remember something about the mean and standard deviation. Look at the **mean**, which equals **1.2 meters** here, and the standard deviation, which equals **0.3 meters**. The brightest yellow rectangle has **mean + std** (standard deviation) as its *right* border and **mean - std** as its *left* border. **The important thing** is that the values between these borders amount to **68.3%** of all values. This range can be called a *confidence interval*, with **68.3%** as its confidence level.

The values between **mean + 2 * std** and **mean - 2 * std** amount to **95.4%** of all values.

The values between **mean + 3 * std** and **mean - 3 * std** amount to **99.7%** of all values.
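These shares can be checked numerically. The sketch below uses the mean and standard deviation from the penguin example and subtracts two `norm.cdf` values to get the share of values between the borders. The variable names are illustrative only.

```python
from scipy.stats import norm

mean, std = 1.2, 0.3  # penguin example from the text

shares = {}
for k in (1, 2, 3):
    low, high = mean - k * std, mean + k * std
    # Share of values between the borders: P(X <= high) - P(X <= low).
    shares[k] = norm.cdf(high, loc=mean, scale=std) - norm.cdf(low, loc=mean, scale=std)
    print(f"within {k} std of the mean: {shares[k]:.1%}")
```

The shares depend only on the number of standard deviations, not on the particular mean and standard deviation chosen.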

**Confidence interval:**

In our case, with a mean of 1.2 and a standard deviation of 0.3, we can say that:

- With **68.3%** confidence, **a randomly chosen imperial penguin's height** is between **1.2 - 0.3** meters and **1.2 + 0.3** meters, that is, between **0.9 and 1.5** meters.
- With **95.4%** confidence, **a randomly chosen imperial penguin's height** is between **1.2 - 2 * 0.3** meters and **1.2 + 2 * 0.3** meters, that is, between **0.6 and 1.8** meters.
- With **99.7%** confidence, **a randomly chosen imperial penguin's height** is between **1.2 - 3 * 0.3** meters and **1.2 + 3 * 0.3** meters, that is, between **0.3 and 2.1** meters.
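The same borders can be obtained in the opposite direction, starting from the percentage. As a sketch, `norm.interval` from `scipy.stats` returns the symmetric range around the mean that contains a given share of the distribution. Because 68.3%, 95.4% and 99.7% are rounded figures, the results come out close to, but not exactly at, the borders above.

```python
from scipy.stats import norm

mean, std = 1.2, 0.3  # penguin example from the text

intervals = {}
for level in (0.683, 0.954, 0.997):
    # Symmetric range around the mean that contains `level` of all values.
    low, high = norm.interval(level, loc=mean, scale=std)
    intervals[level] = (low, high)
    print(f"{level:.1%}: from {low:.2f} to {high:.2f} meters")
```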

Let's recall some functions, but now for the normal distribution (they are a little bit different):

- For outputting a random sample: `norm.rvs(loc, scale, size)`.
- For calculating the probability density at the value `x`: `norm.pdf(x, loc, scale)`. For a continuous distribution, the probability of receiving exactly `x` is zero, so `pdf` returns the height of the curve at `x` rather than a probability.
- For calculating the probability of receiving a value **greater** than `x`: `norm.sf(x, loc, scale)`.
- For calculating the probability of receiving `x` or **less**: `norm.cdf(x, loc, scale)`.

In these functions:

- `loc` is the **mean** value of the distribution.
- `scale` is the **standard deviation** value of the distribution.
- `size` is the number of samples to draw from the distribution.
- `x` is the value at which the function is evaluated.
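As a short sketch, here are the four functions applied to the penguin example (mean 1.2, standard deviation 0.3). The sample size of 5, the value 1.5, and the `random_state` argument (an optional `rvs` parameter, fixed here only for repeatability) are choices made for this illustration.

```python
from scipy.stats import norm

mean, std = 1.2, 0.3  # penguin example from the text

# Five random heights; random_state only makes the draw repeatable.
sample = norm.rvs(loc=mean, scale=std, size=5, random_state=0)

# Height of the density curve at the mean; it can exceed 1,
# which shows that it is not a probability.
density = norm.pdf(1.2, loc=mean, scale=std)

taller = norm.sf(1.5, loc=mean, scale=std)    # P(height > 1.5)
shorter = norm.cdf(1.5, loc=mean, scale=std)  # P(height <= 1.5)

print(sample)
print(density, taller, shorter)
```

Since `sf` is defined as `1 - cdf`, the last two values add up to 1.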

# Task

Now build a **random** distribution of **cats' weights**! Follow the algorithm:

- Import the `norm` object from `scipy.stats`.
- Import `matplotlib.pyplot` with the `plt` alias.
- Import `seaborn` with the `sns` alias.
- Generate a random normal distribution with the following attributes:
  - **Mean** equals `4.2`.
  - **Standard deviation** equals `1`.
- Create a **histplot** with the following parameters:
  - Pass the `dist` variable to the `data` attribute.
  - Pass `True` to the `kde` attribute.
- Output the graph.
