Advanced Probability Theory

Useful Properties of the Gaussian Distribution

The Gaussian distribution (also called the normal distribution) is one of the most important distributions in probability theory and statistics. This chapter looks at some of its useful properties, explains why it matters, and shows how it is applied in practice.

Physical meaning of the Gaussian distribution

The Gaussian distribution describes a random variable that arises as the sum of many small, independent factors.

For example, when weighing an object, factors such as temperature, pressure, and instrument error each affect the reading. Individually, each factor contributes little, but together they have a significant impact on the result. The chapter on the Central Limit Theorem explains this further.

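A Gaussian random variable X with mean μ and variance σ² is conventionally written X ~ N(μ, σ²). Likewise, a Gaussian random vector with mean vector m and covariance matrix Σ is written X ~ N(m, Σ).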

Linear transformations of Gaussian vectors

The Gaussian distribution is preserved under linear transformations: applying a linear transformation to a Gaussian variable or vector yields another Gaussian variable or vector, with a different mean and variance. For a scalar X ~ N(μ, σ²) and constants a ≠ 0 and b, the result is aX + b ~ N(aμ + b, a²σ²).
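
The scalar case can be checked empirically. The sketch below is illustrative only: the parameters `mu`, `sigma`, `a`, `b`, the seed, and the sample size are arbitrary assumptions, not values from the course. It draws samples of X, applies Y = a*X + b, and compares the sample mean and standard deviation of Y with the theoretical values a*mu + b and |a|*sigma. Sample skewness and excess kurtosis close to zero are consistent with Y still being Gaussian, although they do not prove it.

```python
import numpy as np
from scipy.stats import skew, kurtosis

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

mu, sigma = 2.0, 3.0   # illustrative parameters of X ~ N(mu, sigma^2)
a, b = -0.5, 4.0       # illustrative coefficients of Y = a*X + b

x = rng.normal(mu, sigma, size=200_000)
y = a * x + b

# Theory: Y ~ N(a*mu + b, a^2 * sigma^2)
expected_mean = a * mu + b
expected_std = abs(a) * sigma

print(f"mean: sample={y.mean():.3f}, theory={expected_mean:.3f}")
print(f"std:  sample={y.std():.3f}, theory={expected_std:.3f}")

# For a Gaussian, skewness and excess kurtosis are both 0
print(f"skewness={skew(y):.3f}, excess kurtosis={kurtosis(y):.3f}")
```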

Uncorrelated Gaussian variables are independent

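Correlation captures only linear dependence between variables, so in general variables can be dependent and still uncorrelated. For jointly Gaussian variables (that is, components of a Gaussian vector), however, zero correlation implies independence, which is a very useful property of the Gaussian distribution. The requirement that the variables be jointly Gaussian matters: two variables that are each Gaussian on their own, but not jointly Gaussian, can be uncorrelated and still dependent.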
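
The implication from zero correlation to independence relies on the variables being jointly Gaussian. The following sketch uses a standard textbook counterexample, not one taken from the course: X is standard Gaussian and Y = S*X, where S is an independent random sign. Y is also standard Gaussian and is uncorrelated with X, yet |Y| = |X| always, so the two are clearly dependent. The seed and sample size are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, chosen only for reproducibility
n = 100_000

x = rng.normal(0.0, 1.0, size=n)
s = rng.choice([-1.0, 1.0], size=n)  # random sign, independent of x
y = s * x                            # marginally N(0, 1), but (x, y) is not jointly Gaussian

# Correlation between x and y is close to zero...
corr_xy = np.corrcoef(x, y)[0, 1]

# ...but their absolute values coincide, so x and y are dependent
corr_abs = np.corrcoef(np.abs(x), np.abs(y))[0, 1]

print(f"corr(x, y)     = {corr_xy:.4f}")
print(f"corr(|x|, |y|) = {corr_abs:.4f}")
```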

3-sigma rule

The 3-sigma rule, also known as the empirical rule or the 68-95-99.7 rule, is a statistical guideline that states that for a normal distribution:

Approximately 68% of the data falls within one standard deviation (σ) of the mean (μ);
Approximately 95% of the data falls within two standard deviations (2σ) of the mean (μ);
Approximately 99.7% of the data falls within three standard deviations (3σ) of the mean (μ).

This rule is useful for detecting outliers in data that follows a Gaussian distribution: values more than 3σ from the mean are expected only about 0.3% of the time.
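
The percentages above can be recovered from the Gaussian cumulative distribution function. A minimal sketch using `scipy.stats.norm`; the helper name `prob_within` is illustrative. Because the probability of falling within k standard deviations does not depend on μ or σ, the standard normal distribution is enough.

```python
from scipy.stats import norm

def prob_within(k):
    """Return P(|X - mu| <= k * sigma) for a Gaussian X."""
    return norm.cdf(k) - norm.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} sigma: {prob_within(k):.4f}")
# within 1 sigma: 0.6827
# within 2 sigma: 0.9545
# within 3 sigma: 0.9973
```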


import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt

# Generate some data from a normal distribution
mu = 0   # mean
sigma = 1   # standard deviation
x = np.linspace(mu - 4*sigma, mu + 4*sigma, 100)
y = norm.pdf(x, mu, sigma)

# Plot the PDF of the normal distribution
plt.plot(x, y, label='PDF')

# Shade the area within 1, 2, and 3 standard deviations of the mean
plt.fill_between(x, 0, y, where=(x >= mu-sigma) & (x <= mu+sigma), alpha=0.3, label='68%')
plt.fill_between(x, 0, y, where=(x >= mu-2*sigma) & (x <= mu+2*sigma), alpha=0.3, label='95%')
plt.fill_between(x, 0, y, where=(x >= mu-3*sigma) & (x <= mu+3*sigma), alpha=0.3, label='99.7%')

# Add a legend and labels
plt.legend()
plt.xlabel('X')
plt.ylabel('PDF')
plt.title('3-Sigma Rule for a Gaussian Distribution')

# Show the plot
plt.show()
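
The 3-sigma rule can also be turned into a simple outlier filter. The sketch below is illustrative: the dataset is synthetic, and the two injected values, the seed, and the names `data`, `is_outlier`, and `outliers` are assumptions made for the example. It flags every point that lies more than three sample standard deviations from the sample mean.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed, chosen only for reproducibility

# Synthetic Gaussian data with two artificially injected extreme values
data = rng.normal(loc=10.0, scale=2.0, size=1_000)
data = np.append(data, [25.0, -8.0])

# Estimate the mean and standard deviation from the data itself
mean, std = data.mean(), data.std()

# Flag points that lie more than 3 standard deviations from the mean
is_outlier = np.abs(data - mean) > 3 * std
outliers = data[is_outlier]

print(f"mean={mean:.2f}, std={std:.2f}, flagged={outliers.size}")
print(outliers)
```

Even for clean Gaussian data, roughly 0.3% of points fall outside the 3σ band, so a few legitimate values may be flagged alongside the injected ones. The filter is only meaningful when the data is approximately Gaussian.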

Section 1. Chapter 6