A/A Test | What is A/B testing?
The Art of A/B Testing

A/A Test

Preparation for the Experiment

Before conducting a controlled experiment, we must make sure that the data in the test group have been collected correctly. Several factors need to be considered:

  • Day-of-the-week effect. People behave differently on weekdays and weekends, so the groups may differ depending on the day. Therefore, the data are collected over a full week;
  • Seasonality. During holidays, users shop more actively, which can give a false picture of real sales. Therefore, the data are collected in a season without holidays;
  • Growing number of users over time. More and more people become involved in the experiment as it runs.

With these factors in mind, we conducted an online experiment for three groups of users. Each group was tested for a full week, and an equal number of users took part in each experiment. The experiment took place in the off-season (there were no holidays that would have provoked an increase in sales).
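The equal group sizes mentioned above are usually obtained by random assignment. The following is a minimal sketch of that idea, not the procedure used to build the course data: the function name `split_into_groups` and the integer user IDs are hypothetical.

```python
import numpy as np

def split_into_groups(user_ids, n_groups=3, seed=0):
    """Randomly assign users to n_groups groups of (near-)equal size."""
    rng = np.random.default_rng(seed)
    # Shuffle first so that group membership does not depend on ID order
    shuffled = rng.permutation(np.asarray(user_ids))
    # np.array_split keeps group sizes within one user of each other
    parts = np.array_split(shuffled, n_groups)
    return {f"group_{i + 1}": part for i, part in enumerate(parts)}

user_ids = np.arange(300)  # hypothetical user identifiers
groups = split_into_groups(user_ids, n_groups=3, seed=42)
for name, members in groups.items():
    print(name, len(members))
```

With 300 users and three groups, each group receives 100 users, and every user lands in exactly one group.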

The metric of the success of the experiment is the conversion rate. It is time to check the adequacy of our results.

Let's get acquainted with the data. Both datasets have one hundred records and three columns. The first column, 'Male', is binary: 1 means the user is male, and 0 means the user is female. The second column, 'Page view', holds the number of page views. The third column, 'Purchase', holds the number of purchases. Let's see what these tables look like:
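Before running any test, it can help to confirm that a table really has the structure described above. The sketch below is an illustration under assumptions: the helper `check_structure` is hypothetical, the column names follow the capitalization used in the lesson's code, and a small synthetic table stands in for the course files.

```python
import pandas as pd

# Assumed column names, following the lesson's description and code
EXPECTED_COLUMNS = ["Male", "Page view", "Purchase"]

def check_structure(df, expected_rows=100):
    """Return a list of problems; an empty list means the table looks as described."""
    problems = []
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        problems.append(f"missing columns: {missing}")
    if len(df) != expected_rows:
        problems.append(f"expected {expected_rows} rows, got {len(df)}")
    # 'Male' is described as binary, so only 0 and 1 are allowed
    if "Male" in df.columns and not df["Male"].isin([0, 1]).all():
        problems.append("'Male' contains values other than 0 and 1")
    return problems

# Small synthetic table standing in for one of the course files
sample = pd.DataFrame({
    "Male": [1, 0, 1, 0],
    "Page view": [10, 7, 3, 12],
    "Purchase": [1, 0, 0, 2],
})
print(check_structure(sample, expected_rows=4))

# The same table with an invalid value in the binary column
bad = sample.assign(Male=[1, 0, 2, 0])
print(check_structure(bad, expected_rows=4))
```

For the real files, the same function could be called with the default `expected_rows=100`.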

```python
# Import libraries
import pandas as pd

# Read .csv files
control_group_1 = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/updated_first.csv')
control_group_2 = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/updated_second.csv')

# Show head of the first file
print(control_group_1.head())
```

The second file is similar to the first file. But can we say that there is no statistically significant difference between them?

We must make sure that no outside factors influenced our experiment. In other words, the average metric values of control group 1 and control group 2 should not differ significantly.

Let's formulate hypotheses:

H₀: There is no statistically significant difference between the means of the two samples.

Hₐ: There is a statistically significant difference between the means of the two samples.

Our first test:

```python
# Import libraries
import pandas as pd
from scipy.stats import mannwhitneyu

# Read .csv files
control_group_1 = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/updated_first.csv')
control_group_2 = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/updated_second.csv')

# Define metric
control_group_1['Conversion'] = (control_group_1['Purchase'] / control_group_1['Page view']).round(2)
control_group_2['Conversion'] = (control_group_2['Purchase'] / control_group_2['Page view']).round(2)

# Do U-Test
stat, p = mannwhitneyu(control_group_1['Conversion'], control_group_2['Conversion'])

# Identify the test result
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('There is no statistically significant difference between the medians of the two samples')
else:
    print('There is a statistically significant difference between the medians of the two samples')
```

Since p > 0.05, we cannot reject the null hypothesis: the test finds no statistically significant difference between the two control groups.

Looks easy, right?

In this code, we loaded two groups of data, control_group_1 and control_group_2, computed the conversion for each user, performed a U-test using the mannwhitneyu function, and printed the test results.
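The same steps can be wrapped in a reusable function. This is a sketch under assumptions rather than part of the course code: the name `run_aa_check`, the `alpha` parameter, and the tiny synthetic tables are hypothetical, and the two-sided alternative is stated explicitly instead of relying on the library default.

```python
import pandas as pd
from scipy.stats import mannwhitneyu

def run_aa_check(group_1, group_2, alpha=0.05):
    """Compare per-user conversion of two control groups with the Mann-Whitney U test."""
    # Same metric as in the lesson: purchases divided by page views, rounded to 2 digits
    conv_1 = (group_1["Purchase"] / group_1["Page view"]).round(2)
    conv_2 = (group_2["Purchase"] / group_2["Page view"]).round(2)
    stat, p = mannwhitneyu(conv_1, conv_2, alternative="two-sided")
    # Same decision rule as in the lesson: p > alpha means no significant difference
    return {"stat": float(stat), "p": float(p), "groups_differ": bool(p <= alpha)}

# Tiny synthetic tables (not the course data)
first = pd.DataFrame({
    "Page view": [10, 20, 5, 8, 12, 15],
    "Purchase": [1, 2, 0, 1, 3, 1],
})
# The same rows in reverse order, so no difference is expected
second = first.iloc[::-1].reset_index(drop=True)

result = run_aa_check(first, second)
print(result)
```

Returning a dictionary instead of printing makes it easy to reuse the check on other pairs of groups.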

Why was this particular test chosen? We'll talk about this in the next chapters.
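To get a feeling for what the 0.05 threshold means in an A/A setting, one can simulate many pairs of groups drawn from the same distribution and count how often the test still reports a difference. The sketch below uses synthetic uniform data, not the course files; the function name `aa_false_positive_rate` and all of its parameters are hypothetical.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def aa_false_positive_rate(n_simulations=500, group_size=100, alpha=0.05, seed=0):
    """Share of simulated A/A comparisons in which the U-test reports a difference."""
    rng = np.random.default_rng(seed)
    false_positives = 0
    for _ in range(n_simulations):
        # Both groups come from the same distribution, so any detected
        # difference is a false positive
        group_a = rng.uniform(0.0, 1.0, size=group_size)
        group_b = rng.uniform(0.0, 1.0, size=group_size)
        _, p = mannwhitneyu(group_a, group_b, alternative="two-sided")
        if p <= alpha:
            false_positives += 1
    return false_positives / n_simulations

rate = aa_false_positive_rate()
print(f"Share of A/A comparisons flagged as significant: {rate:.3f}")
```

The printed share should be close to the chosen threshold: even when nothing differs, roughly one comparison in twenty is expected to look significant at the 0.05 level.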

What are the hypotheses formulated for checking the adequacy of the results?