Product Analytics for Beginners
Experimentation and A/B Testing

Statistical Significance


When you run an A/B test, you want to know if the difference you see between the control and variant groups is real or just a result of random chance. Think about flipping a coin: if you flip it ten times and get seven heads, does that mean the coin is unfair? Or was it just luck? In product analytics, this is where statistical significance comes in. It helps you decide if the difference in outcomes - like more users clicking a new button - is likely to be meaningful, or if it could have happened just by accident, like a streak of heads in coin flips.
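
To make the coin example concrete, here is a minimal sketch using SciPy's binomtest (the flip counts are just illustrative numbers) to ask how surprising seven heads in ten flips would be if the coin were actually fair:

from scipy import stats

# Seven heads in ten flips of a supposedly fair coin (p = 0.5)
result = stats.binomtest(k=7, n=10, p=0.5)

print("p-value:", result.pvalue)  # about 0.34: seven heads is not surprising

A p-value that large means seven heads could easily happen with a fair coin. The next example applies the same logic to an A/B test, using an independent t-test to compare simulated daily conversions between a control and a variant group.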

import numpy as np
from scipy import stats

# Simulated data: daily conversions for control and variant groups
control = np.array([30, 28, 35, 33, 29, 31, 32])
variant = np.array([36, 34, 39, 37, 35, 38, 40])

# Perform an independent two-sample t-test
t_stat, p_value = stats.ttest_ind(variant, control)

print("t-statistic:", t_stat)
print("p-value:", p_value)

if p_value < 0.05:
    print("Result is statistically significant: the variant performed differently from control.")
else:
    print("Result is not statistically significant: no strong evidence of a difference.")
Definition

Statistical significance indicates that observed differences are unlikely to be due to random chance alone.

When you get a p-value from your statistical test, it tells you how likely it would be to see a difference as large as, or larger than, the one you observed if there were no real difference at all (the null hypothesis). A low p-value (for example, below 0.05) means it's unlikely the results happened randomly, so you can be more confident that your change made a real impact. If the p-value is high, you can't rule out that the difference was just luck. This helps you make product decisions with confidence: launch new features when the evidence is strong, and avoid acting on results that might not hold up.
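
One way to build intuition for what the p-value measures is a permutation test: shuffle the group labels many times and count how often a difference at least as large as the observed one appears by pure chance. The sketch below reuses the simulated conversion numbers from the example above:

import numpy as np

rng = np.random.default_rng(42)

control = np.array([30, 28, 35, 33, 29, 31, 32])
variant = np.array([36, 34, 39, 37, 35, 38, 40])

observed = variant.mean() - control.mean()
pooled = np.concatenate([control, variant])

# Shuffle group labels many times and count how often a difference
# at least as large as the observed one arises by chance alone
n_iter = 10_000
count = 0
for _ in range(n_iter):
    permuted = rng.permutation(pooled)
    diff = permuted[:7].mean() - permuted[7:].mean()
    if abs(diff) >= abs(observed):
        count += 1

print("Observed difference:", observed)
print("Permutation p-value:", count / n_iter)

The proportion it prints should land roughly in the same range as the p-value reported by ttest_ind, since both estimate how rarely a difference this large would occur by chance alone.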

The significance level, often shown as α (alpha), is a threshold you set before running your test to decide how much risk of a false positive (Type I error) you are willing to accept. In A/B testing, it represents the probability of incorrectly concluding that a real difference exists when, in fact, the difference is just due to random chance.

  • The most common significance level is 0.05, or 5%;
  • This means you accept a 5% chance of wrongly declaring a difference when there is none, as the simulation after this list illustrates;
  • Lowering the significance level (for example, to 0.01) makes your test more strict, reducing the risk of a false positive but requiring stronger evidence to declare significance;
  • The significance level is set before you collect or analyze your data.
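
To see what that 5% false positive rate means in practice, the sketch below simulates many A/A tests: both groups are drawn from the same distribution (the normal parameters here are arbitrary assumptions), so every "significant" result is a Type I error. Roughly 5% of the tests should cross the 0.05 threshold.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# A/A simulation: both groups come from the same distribution,
# so any significant result is a false positive (Type I error)
alpha = 0.05
n_tests = 10_000
false_positives = 0

for _ in range(n_tests):
    a = rng.normal(loc=32, scale=3, size=7)
    b = rng.normal(loc=32, scale=3, size=7)
    _, p = stats.ttest_ind(a, b)
    if p < alpha:
        false_positives += 1

print("False positive rate:", false_positives / n_tests)  # close to 0.05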

In practice, if your p-value is less than your chosen significance level, you consider the result statistically significant and more likely to reflect a real effect. If the p-value is higher, you do not have enough evidence to confidently say there is a true difference. Setting the right significance level helps you balance the risks of making wrong decisions in your product experiments.
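
As a quick illustration, the snippet below checks the p-value from the earlier t-test against two common thresholds. For this particular data the result clears both, but a marginal p-value (say, 0.03) would pass at 0.05 and fail at the stricter 0.01.

import numpy as np
from scipy import stats

control = np.array([30, 28, 35, 33, 29, 31, 32])
variant = np.array([36, 34, 39, 37, 35, 38, 40])

_, p_value = stats.ttest_ind(variant, control)

# Compare the same p-value against a looser and a stricter threshold
for alpha in (0.05, 0.01):
    decision = "significant" if p_value < alpha else "not significant"
    print(f"alpha = {alpha}: p = {p_value:.4f} -> {decision}")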

1. What does a low p-value indicate in hypothesis testing?

2. Fill in the blank: A result is considered statistically significant if the p-value is less than ___.
