Learn Law of Large Numbers for Bernoulli Process | The Limit Theorems of Probability Theory

A Bernoulli trial is a statistical experiment with only two possible outcomes, usually success and failure, with fixed probabilities of occurrence on each trial. It was considered in more detail in the Probability Theory Basics course.

In a Bernoulli process, each trial is independent, meaning the outcome of one trial does not affect the outcome of any other trial. The probability of success, denoted by p, is the same for every trial. The probability of failure is indicated by q = 1 - p.

Let's apply the law of large numbers to this scheme. Assume that we perform n experiments and count the successful results. According to the law of large numbers, the share of successes approaches p as n grows:

$$\frac{X_1 + X_2 + \dots + X_n}{n} \to p \quad \text{as } n \to \infty$$

Each variable in the numerator represents the outcome of one experiment: it's 1 if the experiment succeeds (with probability p) and 0 if it fails (with probability 1-p).

In this case, the conditions of the law of large numbers are met: the variables are independent (as the experiments are independent), identically distributed, and have a finite expectation (as shown by the distribution series).
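
The finite expectation mentioned above can be checked directly from the two-point distribution of each variable. The worked step below is a sketch in which X_i denotes the 0/1 outcome of the i-th experiment (this notation is an assumption made here for illustration):

$$E[X_i] = 1 \cdot p + 0 \cdot (1 - p) = p, \qquad \operatorname{Var}(X_i) = E[X_i^2] - (E[X_i])^2 = p - p^2 = p(1 - p)$$

Both values are finite for any p between 0 and 1, and the expectation equals p, which is why the average of the outcomes is expected to settle around the probability of success.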

Therefore, we can use the law of large numbers to estimate the probabilities of an event's occurrence by analyzing the frequency of its occurrence.
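
To make this concrete, here is a minimal sketch of how the frequency estimate might behave for different numbers of trials. The function name `estimate_probability`, the seed, and the value p = 0.3 are illustrative assumptions, not part of the original lesson.

```python
import numpy as np

def estimate_probability(p, n_trials, seed=0):
    """Estimate p as the relative frequency of successes in n_trials Bernoulli trials."""
    rng = np.random.default_rng(seed)
    # Each trial is 1 (success) with probability p and 0 (failure) otherwise
    trials = rng.binomial(1, p, size=n_trials)
    # The mean of 0/1 outcomes is the share of successes
    return trials.mean()

if __name__ == "__main__":
    p = 0.3  # hypothetical true probability of success
    for n in (10, 100, 10_000, 1_000_000):
        estimate = estimate_probability(p, n)
        print(f"n = {n:>9}: estimate = {estimate:.4f}, error = {abs(estimate - p):.4f}")
```

The error is not guaranteed to shrink at every step, but it should typically become smaller as the number of trials grows.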

For example, consider flipping a biased coin (one with a shifted center of gravity). Our goal is to estimate the probability of it landing heads up. Check out the code below:


import numpy as np
import matplotlib.pyplot as plt
# Set the probability of heads to 0.3
p = 0.3

# Generate 2000 flips of the coin with probability of heads equal to `p`
coin_flips = np.random.choice([1, 0], size=2000, p=[p, 1-p])
# Function that will calculate mean value of subsamples
def mean_value(data, subsample_size):
  return data[:subsample_size].mean()

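# Visualizing the results: running mean over the first k flips, k = 1..2000
# (starting at k = 1 avoids an artificial zero point and uses the full sample)
x = np.arange(1, 2001)
y = np.array([mean_value(coin_flips, k) for k in x])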

plt.plot(x, y, label='Estimated probability')
plt.xlabel('Number of elements to calculate probability')
plt.ylabel('Probability of success')
plt.axhline(y=p, color='k', label='Real probability of success')
plt.legend()
plt.show()

Similarly, the law of large numbers can be applied to a multinomial scheme, where each trial has more than two possible outcomes: we encode the occurrence of the event or events of interest as 1 and all other outcomes as 0. Let's look at an example:


import numpy as np
import matplotlib.pyplot as plt

# Our distribution with 4 possible values
outcomes = ['Red', 'Blue', 'Black', 'Green']
# Probabilities of corresponding values
probs = [0.3, 0.2, 0.4, 0.1]

# Generate samples
samples = np.random.choice(outcomes, size=2000, p=probs)

# Suppose we want to determine the probability of occurrence of red or black colors. 
# Let's transform the data in such a way that 1 stands in place of 'Red' and 'Black' colors, 
# and 0 in place of other colors
encoded_samples = np.where(np.logical_or(samples == 'Red', samples == 'Black'), 1, 0)

# Function that will calculate mean value of subsamples
def mean_value(data, subsample_size):
  return data[:subsample_size].mean()

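# Visualizing the results: running mean over the first k samples, k = 1..2000
# (starting at k = 1 avoids an artificial zero point and uses the full sample)
x = np.arange(1, 2001)
y = np.array([mean_value(encoded_samples, k) for k in x])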

plt.plot(x, y, label='Estimated probability')
plt.xlabel('Number of elements to calculate probability')
plt.ylabel('Probability of success')
plt.axhline(y=probs[0]+probs[2], color='k', label='Real probability of success')
plt.legend()
plt.show()
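
The same 0/1 encoding can be applied to each outcome in turn to estimate all category probabilities at once. The sketch below assumes the same four colors and probabilities as above; the function name `estimate_category_probabilities`, the seed, and the sample size are hypothetical choices for illustration.

```python
import numpy as np

def estimate_category_probabilities(outcomes, probs, n_samples, seed=0):
    """Estimate the probability of each outcome by its relative frequency."""
    rng = np.random.default_rng(seed)
    samples = rng.choice(outcomes, size=n_samples, p=probs)
    # For each label, encode "sample equals label" as 1/0 and take the mean
    return {label: float(np.mean(samples == label)) for label in outcomes}

if __name__ == "__main__":
    outcomes = ['Red', 'Blue', 'Black', 'Green']
    probs = [0.3, 0.2, 0.4, 0.1]
    estimates = estimate_category_probabilities(outcomes, probs, n_samples=200_000)
    for label, true_p in zip(outcomes, probs):
        print(f"{label:>5}: estimated {estimates[label]:.4f}, true {true_p:.1f}")
    # The probability of a compound event is the sum over its outcomes
    print("Red or Black:", round(estimates['Red'] + estimates['Black'], 4))
```

Because the outcomes are mutually exclusive, adding the estimates for 'Red' and 'Black' gives the same value as averaging the combined 0/1 encoding used in the example above.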
