Learn t-test Math | What Is Hypothesis Testing?
Applied Hypothesis Testing & A/B Testing

t-test Math

Understanding the math behind the t-test is essential for applying it confidently in real-world A/B testing scenarios. The t-test helps you compare the means of two independent samples to determine if any observed difference is statistically significant, or if it could have occurred by random chance. To do this, you must calculate the t-statistic, which measures how many standard errors the difference in sample means is away from zero under the null hypothesis.

The formula for the t-statistic in the case of two independent samples is:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

where:

  • $\bar{x}_1$ and $\bar{x}_2$ are the sample means for group 1 and group 2;
  • $s_1^2$ and $s_2^2$ are the sample variances;
  • $n_1$ and $n_2$ are the sample sizes.

The denominator combines the estimated variances from both groups, each scaled by its sample size, to give the standard error of the difference in means. This form of the statistic (Welch's t-test) assumes the two samples are independent but does not require them to have equal variances.
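To make the role of the sample sizes concrete, here is a minimal sketch of the denominator on its own. The helper name `standard_error` and the variances 4.0 and 9.0 are hypothetical, chosen only for illustration.

```python
import math

def standard_error(var1, n1, var2, n2):
    """Standard error of the difference in means (the t-statistic's denominator)."""
    return math.sqrt(var1 / n1 + var2 / n2)

# Hypothetical sample variances, held fixed while the sample sizes grow
var1, var2 = 4.0, 9.0

se_small = standard_error(var1, 10, var2, 10)      # 10 observations per group
se_large = standard_error(var1, 1000, var2, 1000)  # 1000 observations per group

print(f"SE with n=10 per group:   {se_small:.4f}")
print(f"SE with n=1000 per group: {se_large:.4f}")
print(f"Ratio: {se_small / se_large:.1f}")
```

With the variances unchanged, 100 times more data per group shrinks the standard error by a factor of 10, so the same difference in means produces a t-statistic 10 times larger.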

To determine the significance of your t-statistic, you also need the degrees of freedom (df), which affect the shape of the t-distribution used to interpret your result. For two samples with possibly unequal variances, the Welch-Satterthwaite equation provides an approximation:

$$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1-1} + \frac{(s_2^2/n_2)^2}{n_2-1}}$$

This approximation remains reliable when the sample sizes or variances differ between the groups, which is common in practical A/B testing.
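As a rough illustration of how the Welch-Satterthwaite value behaves, the sketch below compares it with the pooled-variance degrees of freedom, $n_1 + n_2 - 2$. The function name `welch_df` and all input numbers are hypothetical.

```python
def welch_df(var1, n1, var2, n2):
    """Welch-Satterthwaite approximation to the degrees of freedom."""
    a, b = var1 / n1, var2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# Balanced case: equal variances and equal sample sizes (hypothetical numbers)
df_equal = welch_df(4.0, 20, 4.0, 20)

# Unbalanced case: the smaller group is also the noisier one
n1, n2 = 50, 10
df_unequal = welch_df(1.0, n1, 25.0, n2)

pooled_df = n1 + n2 - 2           # what an equal-variance t-test would use
lower_bound = min(n1 - 1, n2 - 1)

print(f"Balanced case df:   {df_equal:.2f}")
print(f"Unbalanced case df: {df_unequal:.2f}")
print(f"Bounds for the unbalanced case: {lower_bound} to {pooled_df}")
```

In the balanced case the approximation matches the pooled value of 38. In the unbalanced case it drops close to the lower bound, because the small, high-variance group dominates the uncertainty.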

```python
import numpy as np

# Sample data for two independent groups
group1 = np.array([23, 21, 19, 24, 25, 22])
group2 = np.array([30, 28, 27, 31, 29, 32])

# Calculate sample means
mean1 = np.mean(group1)
mean2 = np.mean(group2)

# Calculate sample variances (ddof=1 for sample variance)
var1 = np.var(group1, ddof=1)
var2 = np.var(group2, ddof=1)

# Sample sizes
n1 = len(group1)
n2 = len(group2)

# Calculate t-statistic
se = np.sqrt(var1/n1 + var2/n2)
t_statistic = (mean1 - mean2) / se

# Calculate degrees of freedom using Welch-Satterthwaite equation
numerator = (var1/n1 + var2/n2) ** 2
denominator = ((var1/n1)**2) / (n1 - 1) + ((var2/n2)**2) / (n2 - 1)
df = numerator / denominator

print(f"Sample mean 1: {mean1:.2f}")
print(f"Sample mean 2: {mean2:.2f}")
print(f"Sample variance 1: {var1:.2f}")
print(f"Sample variance 2: {var2:.2f}")
print(f"t-statistic: {t_statistic:.3f}")
print(f"Degrees of freedom: {df:.2f}")
```
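The manual calculation can be cross-checked against a library routine. The sketch below assumes SciPy is installed, which the lesson itself does not use, and calls `scipy.stats.ttest_ind` with `equal_var=False` to select the unequal-variance (Welch) form.

```python
import numpy as np
from scipy import stats

# Same sample data as in the lesson's example
group1 = np.array([23, 21, 19, 24, 25, 22])
group2 = np.array([30, 28, 27, 31, 29, 32])

# Manual t-statistic, following the formula above
se = np.sqrt(group1.var(ddof=1) / len(group1) + group2.var(ddof=1) / len(group2))
t_manual = (group1.mean() - group2.mean()) / se

# Library result; equal_var=False requests the unequal-variance test
result = stats.ttest_ind(group1, group2, equal_var=False)

print(f"Manual t-statistic: {t_manual:.3f}")
print(f"SciPy t-statistic:  {result.statistic:.3f}")
print(f"SciPy p-value:      {result.pvalue:.5f}")
```

The two t-statistics should agree to floating-point precision. For this data the exact value is -43/7, about -6.143, and the Welch-Satterthwaite degrees of freedom come out to 9.8.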

After calculating the t-statistic and degrees of freedom, interpret your results as follows:

  • Compare your t-statistic to critical values from the t-distribution, or calculate a p-value;
  • If the absolute value of your t-statistic is large (given the degrees of freedom), a difference this large would be unlikely to arise by random chance alone if the null hypothesis were true, so you may reject the null hypothesis;
  • If the absolute value of the t-statistic is small, the data do not provide strong evidence against the null hypothesis, and you cannot conclude that the group means are different.
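These interpretation steps can be sketched in code. The example assumes SciPy is available and plugs in the rounded values that the lesson's sample data yields (t ≈ -6.143, df ≈ 9.8). The significance level `alpha = 0.05` and the two-sided test are assumptions, not something the lesson fixes.

```python
from scipy import stats

# Rounded values obtained from the lesson's sample data
t_statistic = -6.143
df = 9.8

alpha = 0.05  # assumed significance level for a two-sided test

# Two-sided p-value: probability of a |t| at least this large under the null
p_value = 2 * stats.t.sf(abs(t_statistic), df)

# Critical value: |t| must exceed this to reject at the chosen alpha
t_critical = stats.t.ppf(1 - alpha / 2, df)

reject_null = abs(t_statistic) > t_critical

print(f"p-value:        {p_value:.5f}")
print(f"Critical value: {t_critical:.3f}")
print(f"Reject the null hypothesis: {reject_null}")
```

Comparing |t| with the critical value and comparing the p-value with `alpha` are two views of the same decision, so they always lead to the same conclusion.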

This process is essential for hypothesis testing and supports the reliability of conclusions in A/B testing.

Section 1. Chapter 4
