The Third T-Test
Let's review the plots for the 'Purchase' column of the control and test groups.
Levene's Test:
# Import libraries
import pandas as pd
from scipy.stats import levene
# Read .csv files
df_control = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/c3b98ad3-420d-403f-908d-6ab8facc3e28/ab_control.csv', delimiter=';')
df_test = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/c3b98ad3-420d-403f-908d-6ab8facc3e28/ab_test.csv', delimiter=';')
# Do Levene's test
statistic, p_value = levene(df_control['Purchase'], df_test['Purchase'])
# Print result of Levene's test
print('Statistic:', statistic)
print('p-value:', p_value)
# Determine whether the variances are similar
if p_value > 0.05:
    print('The variances of the two groups are NOT statistically different')
else:
    print('The variances of the two groups are statistically different')
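The result of Levene's test is what decides the `equal_var` argument passed to `ttest_ind`. A minimal sketch of that link follows. It assumes synthetic samples in place of the lesson's CSV files; the `control` and `test` arrays, their means and spreads, and the seed are hypothetical and are not the course data.

```python
import numpy as np
from scipy.stats import levene, ttest_ind

# Hypothetical synthetic samples standing in for the two 'Purchase' columns
rng = np.random.default_rng(0)
control = rng.normal(loc=550, scale=130, size=40)
test = rng.normal(loc=520, scale=160, size=40)

alpha = 0.05

# Levene's test: H0 states that both groups have equal variances
_, levene_p = levene(control, test)

# Treat the variances as equal only if Levene's test does not reject H0
equal_var = bool(levene_p > alpha)

# equal_var=True runs Student's t-test; equal_var=False runs Welch's t-test
statistic, p_value = ttest_ind(control, test, equal_var=equal_var)

print('Levene p-value:', levene_p)
print('equal_var:', equal_var)
print('t-test p-value:', p_value)
```

With this pattern the t-test variant follows from the data instead of being hard-coded.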
Now perform a t-test for the 'Purchase' columns:
H₀: The mean values of the column do not differ between the groups.
Hₐ: The mean values of the column differ between the groups.
# Import libraries
import pandas as pd
from scipy.stats import ttest_ind
# Read .csv files
df_control = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/c3b98ad3-420d-403f-908d-6ab8facc3e28/ab_control.csv', delimiter=';')
df_test = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/c3b98ad3-420d-403f-908d-6ab8facc3e28/ab_test.csv', delimiter=';')
# Select only the 'Purchase' columns
data_control = df_control['Purchase']
data_test = df_test['Purchase']
# Do T-Test
statistic, p_value = ttest_ind(data_control, data_test, equal_var=True)
# Print result of T-test
print('Statistic:', statistic)
print('p-value:', p_value)
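Comparing the p-value with the significance level can be written as a small helper. This is only a sketch: `interpret_p_value` is a hypothetical name, not part of SciPy, and the default `alpha=0.05` matches the threshold used in this lesson.

```python
def interpret_p_value(p_value, alpha=0.05):
    """Turn a p-value into a plain-language decision about H0."""
    if p_value > alpha:
        # Not enough evidence against H0
        return 'Fail to reject H0: no significant difference in means'
    return 'Reject H0: the means differ significantly'


# 0.350 is the p-value reported for this lesson's data; 0.01 is a made-up contrast
print(interpret_p_value(0.350))
print(interpret_p_value(0.01))
```

Note that failing to reject H₀ does not prove the means are equal; it only means the data do not show a difference at the chosen level.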
In this case, the p-value (0.350) is higher than the chosen significance level (0.05), so there is insufficient evidence that the mean values differ between the groups. Now it's your turn!