A/B Testing in Python
What Criterion to Choose
Imagine that the second variant of the site received more clicks during the test. It is tempting to roll the updated version out to the main page and call it done.
But wait a minute!
Before doing that, we have to check whether the difference between the groups is non-random, that is, unlikely to be explained by chance alone.
To do that, we will:
- Plot the data for each test group to see whether there are any visual differences between the groups;
- Compare the intervals of the distributions to see whether they overlap;
- Run a hypothesis test with a statistical criterion.
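The second step, comparing intervals, can be sketched roughly as below. This is only an illustration: the click counts are synthetic, the helper names `mean_confidence_interval` and `intervals_overlap` are made up for this example, and a t-based 95% interval for the mean is one assumed choice among several possible intervals.

```python
import numpy as np
from scipy import stats


def mean_confidence_interval(values, confidence=0.95):
    """Return (low, high) for a t-based confidence interval of the mean."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    sem = stats.sem(values)  # standard error of the mean
    margin = sem * stats.t.ppf((1 + confidence) / 2, df=len(values) - 1)
    return mean - margin, mean + margin


def intervals_overlap(a, b):
    """Return True if intervals a=(low, high) and b=(low, high) share a point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Synthetic click counts, for illustration only (not real experiment data).
rng = np.random.default_rng(0)
control_clicks = rng.poisson(lam=10, size=500)
test_clicks = rng.poisson(lam=12, size=500)

ci_control = mean_confidence_interval(control_clicks)
ci_test = mean_confidence_interval(test_clicks)

print("control interval:", ci_control)
print("test interval:   ", ci_test)
print("intervals overlap:", intervals_overlap(ci_control, ci_test))
```

Intervals that do not overlap hint at a real difference, but overlapping intervals do not prove that there is none, which is why the hypothesis test is still needed.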
A statistical criterion (a statistical test) is a mathematical rule for deciding whether to reject the null hypothesis, that is, for concluding whether there is a non-random difference between groups. Applying a statistical criterion to the data produces a p-value.
To decide what criterion to choose while performing an A/B test, we need to use this scheme:
The significance level is the threshold that the p-value is compared against: it is the risk we are willing to accept of declaring a difference when there is actually none. A significance level of 5% (or 1%) is commonly used.
So look at the table:
Condition | Action |
---|---|
p-value > significance level | We cannot reject the H0 hypothesis |
p-value < significance level | We reject H0 in favor of the H1 hypothesis |
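The decision rule from the table can be written as a small function. This is a minimal sketch: the name `decide` is hypothetical, the default threshold of 0.05 follows the 5% level mentioned above, and treating a p-value exactly equal to the threshold as "do not reject" is an assumption, since the table does not cover that case.

```python
def decide(p_value, alpha=0.05):
    """Apply the table's rule: compare a p-value with the significance level."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value < alpha:
        return "reject H0 in favor of H1"
    # Assumption: p_value == alpha is treated as not rejecting H0.
    return "cannot reject H0"


print(decide(0.20))               # p-value above 5%
print(decide(0.03))               # p-value below 5%
print(decide(0.03, alpha=0.01))   # same p-value with a stricter 1% level
```

The last call shows why the significance level should be fixed before looking at the data: the same p-value leads to different conclusions at 5% and at 1%.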
To choose the right statistical criterion, we have to understand the distribution of our data. That is what we are going to do now!
To decide which criterion to use, let's first check whether our distribution is normal. Use `scipy.stats.normaltest(data)` to perform this test. Its null hypothesis is that the data come from a normal distribution, so if `normaltest` returns a p-value < 0.05, we reject normality. If the p-value is above 0.05, we have no evidence against a normal distribution.
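As a rough illustration of how to read the `normaltest` result, the sketch below runs it on two synthetic samples. The samples, the helper name `is_plausibly_normal`, and the 0.05 threshold are assumptions made for this example; they are not the course's `clicks` data.

```python
import numpy as np
from scipy import stats


def is_plausibly_normal(sample, alpha=0.05):
    """Return True when normaltest gives no evidence against normality."""
    _, p_value = stats.normaltest(sample)
    # H0 of normaltest: the sample comes from a normal distribution.
    return p_value >= alpha


# Synthetic samples, for illustration only.
rng = np.random.default_rng(42)
normal_sample = rng.normal(loc=50, scale=5, size=1000)
skewed_sample = rng.exponential(scale=5, size=1000)

for name, sample in [("normal", normal_sample), ("skewed", skewed_sample)]:
    statistic, p_value = stats.normaltest(sample)
    print(f"{name}: statistic={statistic:.2f}, p-value={p_value:.4g}, "
          f"plausibly normal={is_plausibly_normal(sample)}")
```

A small p-value here means the data are unlikely to be normal, so a criterion that assumes normality may not be appropriate.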
Don't worry if the information sounds hard! We will cope with that!
Task
- Import `pandas` with the `pd` alias.
- Import `seaborn` with the `sns` alias.
- Import `scipy`.
- Import `statsmodels.api` with the `sm` alias.
- Build the `distplot` using the `clicks` column from `df`.
- Build the `qqplot` using the `clicks` column from `df`.
- Perform the `normaltest` with the `clicks` column from `df`.