Challenge: Fourth T-Test
Task
Your task is to perform a t-test. Recall that the 'Earning' column is normally distributed in both datasets, and that there is no statistically significant difference between their variances. Are the means of the two samples equal?
- Calculate the mean values.
- Perform a t-test.
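The task takes equal variances as given. As a rough illustration of how such an assumption could be checked, the sketch below runs Levene's test from `scipy.stats` on two synthetic samples. The samples, their sizes, the seed, and the 0.05 threshold are assumptions made for illustration only; they are not the course's `ab_control.csv` / `ab_test.csv` data, and the source does not say which test was used to compare the variances.

```python
import numpy as np
from scipy.stats import levene

# Synthetic stand-ins for the two 'Earning' columns (hypothetical values)
rng = np.random.default_rng(0)
control = rng.normal(loc=100.0, scale=15.0, size=40)
test = rng.normal(loc=105.0, scale=15.0, size=40)

# Levene's test: the null hypothesis is that both samples have equal variances
levene_stat, levene_p = levene(control, test)
print('Levene statistic:', levene_stat)
print('Levene p-value:', levene_p)

# Assumed significance level of 0.05
if levene_p > 0.05:
    print('No statistically significant difference between the variances')
else:
    print('The variances are statistically different')
```

If the p-value is above the chosen threshold, passing `equal_var=True` to `ttest_ind` is a reasonable choice; otherwise `equal_var=False` (Welch's t-test) would be the usual alternative.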
Solution
# Import libraries
import pandas as pd
from scipy.stats import ttest_ind
# Read .csv files
df_control = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/c3b98ad3-420d-403f-908d-6ab8facc3e28/ab_control.csv', delimiter=';')
df_test = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/c3b98ad3-420d-403f-908d-6ab8facc3e28/ab_test.csv', delimiter=';')
# Select only the 'Earning' columns
data_control = df_control['Earning']
data_test = df_test['Earning']
# Calculate the mean values
print('The mean of control group = ', data_control.mean())
print('The mean of test group = ', data_test.mean())
# Do T-Test
statistic, p_value = ttest_ind(data_control, data_test, equal_var=True)
# Print the result of T-Test
print('Statistic:', statistic)
print('p-value:', p_value)
# Determine whether there is a statistically significant difference between the means of the two samples
if p_value > 0.05:
    print('The means of the two groups are NOT statistically different')
else:
    print('The means of the two groups are statistically different')
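To make clearer what `equal_var=True` does, here is a minimal sketch that computes the pooled-variance (Student's) t-statistic by hand and compares it with the value returned by `ttest_ind`. It uses synthetic samples (`sample_a`, `sample_b`, with an arbitrary seed and parameters) rather than the course datasets, so the printed numbers will not match the challenge output.

```python
import numpy as np
from scipy.stats import ttest_ind

# Synthetic samples used only for illustration (hypothetical values)
rng = np.random.default_rng(42)
sample_a = rng.normal(loc=50.0, scale=5.0, size=30)
sample_b = rng.normal(loc=52.0, scale=5.0, size=35)

n_a, n_b = len(sample_a), len(sample_b)

# Unbiased sample variances (ddof=1)
var_a = sample_a.var(ddof=1)
var_b = sample_b.var(ddof=1)

# Pooled variance: weighted average of the two sample variances
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)

# t-statistic: difference of means divided by its pooled standard error
manual_t = (sample_a.mean() - sample_b.mean()) / np.sqrt(pooled_var * (1 / n_a + 1 / n_b))

# The same statistic from SciPy with the equal-variance assumption
scipy_t, scipy_p = ttest_ind(sample_a, sample_b, equal_var=True)

print('Manual statistic:', manual_t)
print('SciPy statistic:', scipy_t)
print('p-value:', scipy_p)
```

The two statistics should agree up to floating-point error. With `equal_var=False`, SciPy would instead use separate variance estimates for each sample, and the values would generally differ.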
Section 4. Chapter 5
# Import libraries
import pandas as pd
from scipy.stats import ttest_ind
# Read .csv files
df_control = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/c3b98ad3-420d-403f-908d-6ab8facc3e28/ab_control.csv', delimiter=';')
df_test = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/c3b98ad3-420d-403f-908d-6ab8facc3e28/ab_test.csv', delimiter=';')
# Select only the 'Earning' columns
data_control = df_control['Earning']
data_test = df_test['Earning']
# Calculate the mean values
print('The mean of control group = ', data_control.___)
print('The mean of test group = ', data_test.___)
# Do T-Test
statistic, p_value = ___(___, ___, equal_var=___)
# Print the result of T-Test
print('Statistic:', statistic)
print('p-value:', p_value)
# Determine whether there is a statistically significant difference between the means of the two samples
if p_value > 0.05:
    print('The means of the two groups are NOT statistically different')
else:
    print('The means of the two groups are statistically different')