Learning Statistics with Python
Performing a t-test in Python
To conduct a t-test in Python, all you have to do is specify the alternative hypothesis and indicate whether the variances are roughly equal (homogeneous). The ttest_ind() function within scipy.stats handles the rest. Below is the syntax:
ttest_ind(a, b, equal_var=True, alternative='two-sided')
Parameters:
a: the first sample;
b: the second sample;
equal_var: set to True if the variances are approximately equal, and False if they are not;
alternative: the type of alternative hypothesis:
    'two-sided': the means are not equal;
    'less': the first mean is less than the second;
    'greater': the first mean is greater than the second.
Return values:
statistic: the value of the t statistic;
pvalue: the p-value.
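To make the parameters and return values concrete, here is a minimal sketch that runs the test once for each value of alternative. The samples a and b are synthetic and hypothetical (generated with NumPy, not taken from the course data), and the alternative argument is assumed to be available, which requires SciPy 1.6 or newer.

```python
import numpy as np
import scipy.stats as st

# Hypothetical synthetic samples, used only for illustration
rng = np.random.default_rng(0)
a = rng.normal(loc=175, scale=7, size=200)  # first sample, larger mean
b = rng.normal(loc=165, scale=7, size=200)  # second sample, smaller mean

results = {}
for alternative in ("two-sided", "less", "greater"):
    result = st.ttest_ind(a, b, equal_var=True, alternative=alternative)
    # The result exposes the t statistic and the p-value as attributes
    results[alternative] = (result.statistic, result.pvalue)
    print(f"{alternative:>9}: t = {result.statistic:.3f}, p-value = {result.pvalue:.4g}")
```

The t statistic is the same in all three calls; only the p-value changes, because each alternative measures evidence in a different direction.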
The focus is on the p-value. If the p-value is lower than α (usually 0.05), the t statistic falls within the critical region, so the null hypothesis is rejected in favor of the alternative hypothesis. If the p-value is greater than α, we fail to reject the null hypothesis: the data do not provide sufficient evidence for the alternative. This does not prove that the means are equal.
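The comparison of the p-value with α can be sketched as a small helper. The function name decide and its return strings are hypothetical, chosen for illustration. The boundary case where the p-value equals α is treated here as not rejecting, which is an assumption, since the text only covers "lower" and "greater".

```python
def decide(pvalue: float, alpha: float = 0.05) -> str:
    """Return the test decision for a p-value at significance level alpha."""
    if pvalue < alpha:
        # Strong enough evidence against the null hypothesis
        return "reject H0"
    # Not enough evidence; this does not show that H0 is true
    return "fail to reject H0"


print(decide(0.003))             # p-value below the default alpha of 0.05
print(decide(0.27))              # p-value above the default alpha
print(decide(0.01, alpha=0.001)) # a stricter significance level
```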
Here is an example of applying the t-test to the heights dataset:
import pandas as pd
import scipy.stats as st

# Load the data
male = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/a849660e-ddfa-4033-80a6-94a1b7772e23/Testing2.0/male.csv').squeeze()
female = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/a849660e-ddfa-4033-80a6-94a1b7772e23/Testing2.0/female.csv').squeeze()

# Apply the t-test; the alternative hypothesis is that the male mean is greater
t_stat, pvalue = st.ttest_ind(male, female, equal_var=True, alternative="greater")

# Check whether the null hypothesis should be rejected at the 0.05 level
if pvalue > 0.05:
    print("We fail to reject the null hypothesis, there is no evidence that males are taller")
else:
    print("We reject the null hypothesis, males are taller")
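The example sets equal_var=True. One common way, though not the only one, to decide on that argument is to compare the variances first. The sketch below uses Levene's test (scipy.stats.levene) on hypothetical synthetic samples x and y and passes the outcome to ttest_ind(). With equal_var=False, SciPy performs Welch's t-test, which does not assume equal variances. The 0.05 threshold for Levene's test is an assumption.

```python
import numpy as np
import scipy.stats as st

# Hypothetical synthetic samples with clearly different spreads
rng = np.random.default_rng(1)
x = rng.normal(loc=50, scale=2, size=300)
y = rng.normal(loc=50, scale=10, size=300)

# Levene's test: the null hypothesis is that both samples have equal variances
levene_p = st.levene(x, y).pvalue
equal_var = bool(levene_p > 0.05)  # assumed threshold of 0.05

# equal_var=False switches ttest_ind to Welch's t-test
result = st.ttest_ind(x, y, equal_var=equal_var, alternative="two-sided")
print(f"Levene p-value = {levene_p:.4g}, equal_var = {equal_var}")
print(f"t = {result.statistic:.3f}, p-value = {result.pvalue:.4g}")
```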