# Probability Theory

## Normal Distribution

Hi there! It is time to move on to more complex distributions: continuous ones!

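What is a continuous distribution?

A continuous distribution has an infinite number of possible outcomes: the variable can take any value within an interval. Therefore, we cannot list every outcome with its probability in a table, and the probability of any single exact value is zero. Such distributions are described by a density curve, usually shown as a graph, and probabilities are calculated for intervals of values.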

Let's start with the most widely used and most interesting one, the normal distribution!

To work with this distribution, import the `norm` object from `scipy.stats`. We can then apply functions such as `sf` and `cdf` to it, but not `pmf`. Its counterpart for continuous distributions is `pdf`, which returns the probability density at a point rather than a probability.
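As a minimal sketch of why a density is not a probability, the snippet below evaluates `pdf` at the peak of two normal distributions. The parameter values are illustrative assumptions, not taken from the lesson.

```python
from scipy.stats import norm

# Density of the standard normal distribution (mean 0, std 1) at its peak.
peak_standard = norm.pdf(0)

# A narrower distribution (std 0.1, an illustrative value) has a taller peak.
# The density exceeds 1 here, so it cannot be read as a probability.
peak_narrow = norm.pdf(0, loc=0, scale=0.1)

print(round(peak_standard, 4))  # 0.3989
print(round(peak_narrow, 4))    # 3.9894
```

Probabilities come from the area under the density curve, which is what `cdf` and `sf` return.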

Examples:

1. Animal sizes.
2. People's heights.
3. Birth weights.

To understand the key characteristics, it is better to look at the graph first.

Graph: distribution of emperor penguins' heights in meters.

Key characteristics:

• The graph is bell-shaped: most values cluster around the mean.
• The graph is symmetric about the mean.
• It has thin tails: values far from the mean are rare.

Graph explanation:

You probably remember the mean and the standard deviation. Look at the mean, which equals 1.2 meters here, and the standard deviation, which equals 0.3 meters. The brightest yellow rectangle has mean + std (standard deviation) as its right border and mean - std as its left border. The values between these borders amount to 68.3% of all values. In this lesson, the range from mean - std to mean + std is called the 68.3% confidence interval: the range in which 68.3% of the values fall.

The values between mean + 2 * std and mean - 2 * std amount to 95.4% of all values.

The values between mean + 3 * std and mean - 3 * std amount to 99.7% of all values.
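These percentages can be checked numerically. The sketch below assumes the penguin parameters from the graph (mean 1.2, standard deviation 0.3) and computes each share as the difference of two `cdf` values; the variable names are illustrative.

```python
from scipy.stats import norm

mean, std = 1.2, 0.3  # penguin heights in meters

shares = {}
for k in (1, 2, 3):
    lower = mean - k * std
    upper = mean + k * std
    # Share of values within k standard deviations of the mean:
    # P(lower <= X <= upper) = cdf(upper) - cdf(lower)
    shares[k] = norm.cdf(upper, loc=mean, scale=std) - norm.cdf(lower, loc=mean, scale=std)
    print(f"within {k} std: {shares[k]:.1%}")

# within 1 std: 68.3%
# within 2 std: 95.4%
# within 3 std: 99.7%
```

The same shares hold for any mean and standard deviation, because only the distance from the mean measured in standard deviations matters.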

Confidence interval:

In our case, with a mean of 1.2 and a standard deviation of 0.3, we can say that:

• With 68.3% confidence, a randomly chosen emperor penguin's height is between 1.2 - 0.3 meters and 1.2 + 0.3 meters -> 0.9 and 1.5 meters.
• With 95.4% confidence, a randomly chosen emperor penguin's height is between 1.2 - 2 * 0.3 meters and 1.2 + 2 * 0.3 meters -> 0.6 and 1.8 meters.
• With 99.7% confidence, a randomly chosen emperor penguin's height is between 1.2 - 3 * 0.3 meters and 1.2 + 3 * 0.3 meters -> 0.3 and 2.1 meters.
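The same borders can be recovered in the other direction, starting from the percentage. This is a sketch using `norm.interval`, which returns the central range that contains a given share of the distribution. The share is passed positionally because the name of that parameter differs between SciPy versions.

```python
from scipy.stats import norm

mean, std = 1.2, 0.3  # penguin heights in meters

intervals = []
# Shares quoted in the lesson, rounded to one decimal place.
for share in (0.683, 0.954, 0.997):
    low, high = norm.interval(share, loc=mean, scale=std)
    intervals.append((low, high))
    print(f"{share:.1%}: {low:.1f} m to {high:.1f} m")

# 68.3%: 0.9 m to 1.5 m
# 95.4%: 0.6 m to 1.8 m
# 99.7%: 0.3 m to 2.1 m
```

Because the shares are rounded, the unrounded borders differ slightly from mean ± k * std, which is why the output is printed with one decimal place.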

Let's recall some functions, but for the normal distribution (they are a little bit different):

To generate a random sample: `norm.rvs(loc, scale, size)`.

To calculate the probability density at the value `x`: `norm.pdf(x, loc, scale)`. Unlike `pmf`, this is not the probability of receiving exactly `x`, which is zero for a continuous distribution.

To calculate the probability of receiving a value of `x` or more: `norm.sf(x, loc, scale)`.

To calculate the probability of receiving a value of `x` or less: `norm.cdf(x, loc, scale)`.

• `loc` is the mean value of the distribution.
• `scale` is the standard deviation value of the distribution.
• `size` is the number of random values to generate.
• `x` is the value at which the function is evaluated.
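Here is a short sketch of these four functions applied to the penguin example (mean 1.2, standard deviation 0.3). The threshold of 1.5 meters, the sample size of 5, and the `random_state` argument are assumptions added for illustration.

```python
from scipy.stats import norm

mean, std = 1.2, 0.3  # penguin heights in meters

# Five random heights; random_state only makes the draw repeatable.
sample = norm.rvs(loc=mean, scale=std, size=5, random_state=0)

density_at_mean = norm.pdf(1.2, loc=mean, scale=std)  # density, not a probability
taller_than_1_5 = norm.sf(1.5, loc=mean, scale=std)   # P(X >= 1.5)
at_most_1_5 = norm.cdf(1.5, loc=mean, scale=std)      # P(X <= 1.5)

print(sample)
print(round(density_at_mean, 3))  # 1.33
print(round(taller_than_1_5, 3))  # 0.159
print(round(at_most_1_5, 3))      # 0.841
```

Note that `sf` and `cdf` at the same `x` always add up to 1, since together they cover every possible value.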

Now build a random distribution of cats' weights! Follow the algorithm:

1. Import the `norm` object from `scipy.stats`.
2. Import `matplotlib.pyplot` with the `plt` alias.
3. Import `seaborn` with the `sns` alias.
4. Generate a random normal sample and store it in the `dist` variable, using the parameters:
• Mean equals `4.2`.
• Standard deviation equals `1`.
5. Create a histplot with the following parameters:
• Pass the `dist` variable to the `data` parameter.
• Pass `True` to the `kde` parameter.
6. Output the graph.

Section 5. Chapter 1