t-test Math
Understanding the math behind the t-test is essential for applying it confidently in real-world A/B testing scenarios. The t-test helps you compare the means of two independent samples to determine if any observed difference is statistically significant, or if it could have occurred by random chance. To do this, you must calculate the t-statistic, which measures how many standard errors the difference in sample means is away from zero under the null hypothesis.
The formula for the t-statistic in the case of two independent samples is:
$$
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
$$

where:
- $\bar{x}_1$ and $\bar{x}_2$ are the sample means for group 1 and group 2;
- $s_1^2$ and $s_2^2$ are the sample variances;
- $n_1$ and $n_2$ are the sample sizes.
The denominator combines the estimated variances from both groups, scaled by their respective sample sizes, to calculate the standard error of the difference in means. This formula assumes the two samples are independent and may have unequal variances.
To determine the significance of your t-statistic, you also need the degrees of freedom (df), which affect the shape of the t-distribution used to interpret your result. For two samples with possibly unequal variances, the Welch-Satterthwaite equation provides an approximation:
$$
df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}
$$

This approach is robust even if the sample sizes or variances are not equal, which is common in practical A/B testing.
```python
import numpy as np

# Sample data for two independent groups
group1 = np.array([23, 21, 19, 24, 25, 22])
group2 = np.array([30, 28, 27, 31, 29, 32])

# Calculate sample means
mean1 = np.mean(group1)
mean2 = np.mean(group2)

# Calculate sample variances (ddof=1 for sample variance)
var1 = np.var(group1, ddof=1)
var2 = np.var(group2, ddof=1)

# Sample sizes
n1 = len(group1)
n2 = len(group2)

# Calculate t-statistic
se = np.sqrt(var1/n1 + var2/n2)
t_statistic = (mean1 - mean2) / se

# Calculate degrees of freedom using Welch-Satterthwaite equation
numerator = (var1/n1 + var2/n2) ** 2
denominator = ((var1/n1)**2) / (n1 - 1) + ((var2/n2)**2) / (n2 - 1)
df = numerator / denominator

print(f"Sample mean 1: {mean1:.2f}")
print(f"Sample mean 2: {mean2:.2f}")
print(f"Sample variance 1: {var1:.2f}")
print(f"Sample variance 2: {var2:.2f}")
print(f"t-statistic: {t_statistic:.3f}")
print(f"Degrees of freedom: {df:.2f}")
```
After calculating the t-statistic and degrees of freedom, interpret your results as follows:
- Compare your t-statistic to critical values from the t-distribution, or calculate a p-value;
- If the absolute value of your t-statistic is large (given the degrees of freedom), the observed difference in sample means is unlikely due to random chance, so you may reject the null hypothesis;
- If the t-statistic is small, the data does not provide strong evidence against the null hypothesis, and you cannot conclude the group means are different.
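The p-value step above can be sketched as follows, using SciPy's t-distribution survival function on the Welch t-statistic and degrees of freedom computed earlier (the use of `scipy.stats` here is an assumption; any t-distribution CDF would work). The built-in `scipy.stats.ttest_ind` with `equal_var=False` serves as a cross-check:

```python
import numpy as np
from scipy import stats

# Same sample data as above
group1 = np.array([23, 21, 19, 24, 25, 22])
group2 = np.array([30, 28, 27, 31, 29, 32])

# Welch t-statistic and degrees of freedom, as derived above
var1, var2 = np.var(group1, ddof=1), np.var(group2, ddof=1)
n1, n2 = len(group1), len(group2)
se = np.sqrt(var1/n1 + var2/n2)
t_statistic = (np.mean(group1) - np.mean(group2)) / se
df = (var1/n1 + var2/n2)**2 / ((var1/n1)**2/(n1-1) + (var2/n2)**2/(n2-1))

# Two-sided p-value: probability of observing |t| at least this
# extreme under the null hypothesis of equal means
p_value = 2 * stats.t.sf(abs(t_statistic), df)

# Cross-check against SciPy's built-in Welch's t-test
t_ref, p_ref = stats.ttest_ind(group1, group2, equal_var=False)

print(f"t = {t_statistic:.3f}, df = {df:.2f}, p-value = {p_value:.5f}")
```

A p-value below your chosen significance level (commonly 0.05) corresponds to a t-statistic beyond the critical value, so the two decision rules above are equivalent.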
This process is essential for hypothesis testing and supports the reliability of conclusions in A/B testing.