Challenge: Estimate Mean Value Using Law of Large Numbers | The Limit Theorems of Probability Theory
Advanced Probability Theory

Challenge: Estimate Mean Value Using Law of Large Numbers

Task

Assume that we have some data samples. We know these samples are independent and identically distributed, but we do not know the parameters of their distribution.

Your task is to use the law of large numbers to estimate the expected value of these samples.
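For reference, the result being used can be stated as follows. The symbols X_i (the samples), \mu (their common expected value, assumed finite), and \bar{X}_n (the mean of the first n samples) are notation introduced here for illustration, not taken from the course text.

```latex
\bar{X}_n \;=\; \frac{1}{n}\sum_{i=1}^{n} X_i
\;\xrightarrow{\ \text{a.s.}\ }\; \mu \;=\; \mathbb{E}[X_1]
\qquad \text{as } n \to \infty .
```

In other words, the mean computed over a growing subsample should settle near the expected value, which is why the last value of the running mean serves as the estimate.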
We will also check the assumption that our data follows an exponential distribution: we will build a histogram of the data and compare it with the theoretical PDF of the exponential distribution.
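The histogram-versus-PDF comparison can also be done numerically. The following is a minimal sketch, assuming synthetic samples drawn with rate 0.5 (the rate used in the solution's PDF formula) in place of the course CSV. The names rng, rate, samples, heights, centers, and pdf_at_centers are illustrative and do not come from the course code.

```python
import numpy as np

# Synthetic stand-in for the course data (assumption: rate 0.5, i.e. mean 2).
rng = np.random.default_rng(seed=0)
rate = 0.5
samples = rng.exponential(scale=1.0 / rate, size=5000)

# Density-normalized histogram: bar areas sum to 1, like a PDF.
heights, edges = np.histogram(samples, bins=10, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])

# Theoretical exponential PDF evaluated at the bin centers.
pdf_at_centers = rate * np.exp(-rate * centers)

# Print the two side by side; similar values suggest a similar shape.
for c, h, p in zip(centers, heights, pdf_at_centers):
    print(f"x={c:6.2f}  histogram={h:.4f}  pdf={p:.4f}")
```

With wide bins the bar height is an average of the density over the bin, so it will not match the PDF at the bin center exactly. Agreement in overall shape is all this comparison can show.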

Note

Visualization cannot prove that the data follows a particular distribution. That requires statistical tests, which are covered in the last section of this course. Visualization can, however, at least roughly indicate which class of distributions the data belongs to.

Your task is to:

  1. Plot a histogram using the .hist() method from matplotlib (called here on axes[0]).
  2. Calculate the mean over a given subsample in the mean_value function using the .mean() method.
  3. Pass exp_samples as the argument of the mean_value function to calculate mean values over all subsamples.
  4. Print the estimated mean value of all samples, which is the last value of the y array.
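Steps 2 to 4 above can be sketched without the course CSV. This is a minimal illustration, assuming synthetic exponential samples with mean 2. It swaps the per-subsample mean_value loop for a cumulative sum, which gives the same running means in one pass. The names rng, samples, counts, and running_mean are hypothetical.

```python
import numpy as np

# Synthetic stand-in for exp_samples (assumption: true mean is 2).
rng = np.random.default_rng(seed=1)
samples = rng.exponential(scale=2.0, size=5000)

# running_mean[k - 1] is the mean of the first k samples.
counts = np.arange(1, samples.size + 1)
running_mean = np.cumsum(samples) / counts

# The last entry uses every sample, so it is the final estimate.
print(running_mean[-1])
```

Early entries of running_mean fluctuate strongly, while later entries stay close to the true mean, which is the behavior the law of large numbers predicts.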

Solution

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
exp_samples = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/Advanced+Probability+course+media/expon_samples.csv', names = ['Value'])

# Plot a histogram of the samples and pdf of expon distribution
fig, axes = plt.subplots(1,2)
fig.set_size_inches(10, 5)
axes[0].hist(exp_samples.values, bins=10, alpha=0.5, edgecolor='black', density=True)
axes[0].set_title('Histogram of Data Samples')

# Define the range of x values to plot
x = np.linspace(0, 10, 1000)
# Calculate the PDF of the exponential distribution at each x value
pdf = 0.5 * np.exp(-0.5 * x)
axes[1].plot(x, pdf)
axes[1].set_title('Exponential Distribution PDF')
plt.show()


# Function that will calculate mean value of subsamples
def mean_value(data, subsample_size):
    # Mean of the 'Value' column over the first subsample_size rows
    return data[:subsample_size].mean()['Value']

# Visualizing the results
x = np.arange(5000)
y = np.zeros(5000)
# y[i] holds the mean of the first i samples; y[0] stays 0
for i in range(1, 5000):
    y[i] = mean_value(exp_samples, x[i])

plt.plot(x, y, label='Estimated mean')
plt.xlabel('Number of elements to calculate mean value')
plt.ylabel('Mean value')
plt.title('Mean value of the samples')
plt.legend()
plt.show()

# Print the estimated mean value (the last value of the y array)
print(y[-1])

Section 2. Chapter 3
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
exp_samples = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/Advanced+Probability+course+media/expon_samples.csv', names = ['Value'])

# Plot a histogram of the samples and pdf of expon distribution
fig, axes = plt.subplots(1,2)
fig.set_size_inches(10, 5)
axes[0].___(exp_samples.values, bins=10, alpha=0.5, edgecolor='black', density=True)
axes[0].set_title('Histogram of Data Samples')

# Define the range of x values to plot
x = np.linspace(0, 10, 1000)
# Calculate the PDF of the exponential distribution at each x value
pdf = 0.5 * np.exp(-0.5 * x)
axes[1].plot(x, pdf)
axes[1].set_title('Exponential Distribution PDF')
plt.show()


# Function that will calculate mean value of subsamples
def mean_value(data, subsample_size):
    return data[:subsample_size].___()['Value']

# Visualizing the results
x = np.arange(5000)
y = np.zeros(5000)
for i in range(1, 5000):
    y[i] = mean_value(___, x[i])

plt.plot(x, y, label='Estimated mean')
plt.xlabel('Number of elements to calculate mean value')
plt.ylabel('Mean value')
plt.title('Mean value of the samples')
plt.legend()
plt.show()
