In the last chapter, we covered the concepts of **sample variance** and **adjusted sample variance**. Now let's use simulation to show that the first estimator is biased and the second is unbiased.
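As a reminder, here is what "biased" and "unbiased" mean for these two estimators. The notation ($S^2$, $S_0^2$, $\bar{x}$, $\sigma^2$) is an assumption made for this recap and may differ from the notation used in the previous chapter. For a sample $x_1, \dots, x_n$ of independent observations from a population with variance $\sigma^2$:

$$
S^2 = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2,
\qquad
S_0^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2
$$

$$
\mathbb{E}\left[S^2\right] = \frac{n-1}{n}\,\sigma^2,
\qquad
\mathbb{E}\left[S_0^2\right] = \sigma^2
$$

The sample variance $S^2$ underestimates $\sigma^2$ on average by the factor $\frac{n-1}{n}$, while the adjusted sample variance $S_0^2$ matches it in expectation.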

We will use a Gaussian population: we will compute the sample variance and the adjusted sample variance on many different subsets of the population. Then, relying on the law of large numbers, we will average each estimator over those subsets to approximate its expected value and **compare it with the true variance** of the population.
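The procedure described above can be sketched as follows. This is a minimal illustration, not the challenge's reference solution: the population parameters (`mu`, `sigma`), the sample size `n`, and the number of samples `n_samples` are assumptions chosen for the example. It also draws each sample directly from the Gaussian distribution rather than from a stored finite population. NumPy's `ddof` argument switches between dividing by `n` (`ddof=0`) and by `n - 1` (`ddof=1`).

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the run is reproducible

# Assumed population parameters (illustrative values, not from the lesson)
mu, sigma = 0.0, 2.0
true_variance = sigma ** 2

n = 10               # size of each sample (assumed)
n_samples = 100_000  # number of independent samples (assumed)

# Each row is one sample of size n drawn from the Gaussian population
samples = rng.normal(loc=mu, scale=sigma, size=(n_samples, n))

# Sample variance divides by n; adjusted sample variance divides by n - 1
sample_var = samples.var(axis=1, ddof=0)
adjusted_var = samples.var(axis=1, ddof=1)

# By the law of large numbers, these averages approximate the expected values
mean_sample_var = sample_var.mean()
mean_adjusted_var = adjusted_var.mean()

print(f"True variance:                    {true_variance:.3f}")
print(f"Mean of sample variance:          {mean_sample_var:.3f}")
print(f"Mean of adjusted sample variance: {mean_adjusted_var:.3f}")
```

With these settings, the mean of the sample variance should land near `(n - 1) / n * true_variance` (about 3.6), while the mean of the adjusted sample variance should land near the true variance of 4. A small `n` is used on purpose: the bias shrinks as the sample size grows, so it is easiest to see with short samples.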

Statistics and probability theory are fundamental tools in data analysis, decision-making, and scientific research. They provide a systematic and quantitative way to understand and interpret data, make predictions, and draw conclusions based on evidence. Now we will consider the additional topics needed for Data Science and Data Analytics.

Now we will cover some fundamental theoretical concepts used in solving real-life tasks: absolutely continuous and discrete random variables, the probability density function, the cumulative distribution function, the characteristics of a random variable, etc.

The limit theorems are fundamental laws of probability theory that are often used in practice in a wide variety of areas, such as building confidence intervals, estimating distribution parameters, running A/B tests, and creating ensembles of ML models. Now we will consider two of the most commonly used: the Law of Large Numbers and the Central Limit Theorem.

When we work with real data, we usually do not know which distribution the data came from. To determine this, we must be able to correctly estimate both the type of the distribution and its parameters, which we will learn to do in this section.

We have already learned how to estimate the parameters of a population. But to estimate a parameter, we make an assumption about the population distribution. Can we say that our assumption is correct? How do we check whether the estimated parameters match the real parameters of the population? Can we show that two sets of samples are independent? To answer these questions, we need the concept of hypothesis testing.

Challenge: Checking Bias of An Estimation Using Simulation

Solution
