
## Probability Theory Mastering


# Consistent Estimation

In statistics, a **consistent estimator** is an estimator that converges to the true value of the parameter as the sample size increases, meaning that the estimate becomes more and more accurate as more data is collected. Formally, an estimator $\hat{\theta}_n$ of a parameter $\theta$ is consistent if it converges to $\theta$ in probability:

$$\lim_{n \to \infty} P\left(\left|\hat{\theta}_n - \theta\right| > \varepsilon\right) = 0 \quad \text{for every } \varepsilon > 0.$$

This definition may seem rather complicated, and in practice it is not always easy to verify consistency directly from it. That is why we introduce a **simpler applied criterion** of consistency: if

$$\lim_{n \to \infty} E\left[\hat{\theta}_n\right] = \theta \quad \text{and} \quad \lim_{n \to \infty} \operatorname{Var}\left(\hat{\theta}_n\right) = 0,$$

then $\hat{\theta}_n$ is consistent (this follows from Chebyshev's inequality).

Thus, if our estimator is **unbiased** (or at least **asymptotically unbiased**) and its **variance decreases to zero** as the sample size grows, then the estimator is **consistent**.
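As a quick numerical illustration of the criterion (a minimal sketch assuming NumPy; the function name and parameters are mine): the sample mean is unbiased for every $n$, and its variance $\sigma^2/n$ shrinks toward zero. We can see the second condition empirically by simulating many repeated samples for each sample size and measuring the spread of the resulting sample means:

```python
import numpy as np

def empirical_mean_variance(n, reps=2000, sigma=2.0, seed=0):
    """Estimate Var(sample mean) for samples of size n by drawing
    `reps` independent samples and taking the variance of their means."""
    rng = np.random.default_rng(seed)
    # Each row is one sample of size n; mean(axis=1) gives `reps` sample means
    means = rng.normal(0.0, sigma, size=(reps, n)).mean(axis=1)
    return means.var()

for n in (10, 100, 1000):
    print(f"n = {n:>5}: empirical Var(sample mean) = {empirical_mean_variance(n):.5f}")
```

The printed variances fall roughly like $\sigma^2/n$, matching the criterion's second condition.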

Let's show that the sample mean and the adjusted sample variance are consistent estimators.

## Sample mean estimation

**The sample mean** is a consistent estimator directly by the law of large numbers: the more terms we include in the average, the closer the result gets to the mathematical expectation.
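This is easy to observe numerically. The sketch below (assuming NumPy; the function name, distribution, and sample sizes are illustrative) draws increasingly large samples from a distribution with known mean and prints how far the sample mean lands from it:

```python
import numpy as np

def sample_mean_errors(true_mean=5.0, scale=2.0,
                       sizes=(10, 100, 10_000, 1_000_000), seed=0):
    """Absolute error |sample mean - true mean| for increasing sample sizes."""
    rng = np.random.default_rng(seed)
    errors = []
    for n in sizes:
        sample = rng.normal(loc=true_mean, scale=scale, size=n)
        errors.append(abs(sample.mean() - true_mean))
    return errors

for n, err in zip((10, 100, 10_000, 1_000_000), sample_mean_errors()):
    print(f"n = {n:>9}: |mean - 5.0| = {err:.5f}")
```

As the sample size grows, the error shrinks toward zero, exactly as the law of large numbers predicts.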

## Adjusted sample variance estimation

To check the consistency of the **adjusted sample variance**, let's use a simulation:
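A minimal simulation sketch (assuming NumPy; the function name and parameters are mine, and it prints the errors rather than plotting them): we draw samples of growing size from a distribution with known variance and compare the adjusted sample variance against the true value.

```python
import numpy as np

def adjusted_variance_errors(true_var=4.0,
                             sizes=(10, 100, 10_000, 1_000_000), seed=1):
    """Absolute error |s^2 - sigma^2| of the adjusted (Bessel-corrected)
    sample variance for increasing sample sizes."""
    rng = np.random.default_rng(seed)
    errors = []
    for n in sizes:
        sample = rng.normal(loc=0.0, scale=np.sqrt(true_var), size=n)
        s2 = sample.var(ddof=1)  # ddof=1 divides by n-1: the adjusted estimator
        errors.append(abs(s2 - true_var))
    return errors

for n, err in zip((10, 100, 10_000, 1_000_000), adjusted_variance_errors()):
    print(f"n = {n:>9}: |s^2 - 4.0| = {err:.5f}")
```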

The simulation shows that as the sample size increases, the adjusted sample variance **tends to its true value**, so the estimator is **consistent**.
