Learn Conduct a t-test | Statistical Testing
Learning Statistics with Python

Conduct a t-test

A company wants to determine if there is a significant difference in the productivity levels of developers who work from home versus those who work in the office. Fortunately, you already know that a t-test can help with this.

The company has two independent developer teams: one works remotely, and the other works from the office. You've been provided with two files, 'work_from_home.csv' and 'work_from_office.csv', which contain the monthly task completion counts for each developer.

The task is to conduct a t-test. The company wants to know whether developers who work from the office are more productive than those who work from home. If so, it will require the remote team to work from the office as well. If home workers turn out to be more productive, the company will not make any changes. The desired alternative hypothesis is therefore "The mean productivity of office workers is greater than that of home workers".
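
Written out formally, the one-sided hypotheses described above could look like the following. The symbols for the two population means are notation introduced here for illustration, not taken from the lesson:

```latex
H_0:\ \mu_{\text{office}} \le \mu_{\text{home}}
\qquad
H_1:\ \mu_{\text{office}} > \mu_{\text{home}}
```

Because the alternative points in one direction only, a one-sided test is appropriate rather than the default two-sided one.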

Check if the variances are the same:

import pandas as pd

home_workers = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/a849660e-ddfa-4033-80a6-94a1b7772e23/Testing2.0/work_from_home.csv').squeeze()
office_workers = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/a849660e-ddfa-4033-80a6-94a1b7772e23/Testing2.0/work_from_office.csv').squeeze()

# Printing sample standard deviations
print('Home workers std:', home_workers.std())
print('Office workers std:', office_workers.std())

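The second standard deviation is twice the first, which means the variances differ by roughly a factor of four, so the equal-variance assumption does not hold. Recall the ttest_ind function from scipy.stats (imported as st), which performs a t-test for two independent samples: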

st.ttest_ind(a, b, equal_var=True, alternative='two-sided')
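
To see what the equal_var parameter changes, here is a minimal sketch on synthetic data. The arrays a and b, their sizes, and the random seed are assumptions made for illustration and are not the course files. With equal_var=True, ttest_ind runs Student's t-test with a pooled variance; with equal_var=False it runs Welch's t-test, which does not assume equal variances.

```python
import numpy as np
import scipy.stats as st

# Synthetic stand-ins for two independent samples (NOT the course data).
rng = np.random.default_rng(0)
a = rng.normal(loc=50, scale=5, size=40)    # smaller spread
b = rng.normal(loc=50, scale=10, size=25)   # roughly twice the spread

# Sample standard deviations (ddof=1 matches the default of pandas' Series.std()).
std_a, std_b = a.std(ddof=1), b.std(ddof=1)
print('std ratio b/a:', std_b / std_a)
print('variance ratio b/a:', (std_b / std_a) ** 2)

# equal_var=True  -> Student's t-test (pooled variance)
# equal_var=False -> Welch's t-test (no equal-variance assumption)
student = st.ttest_ind(a, b, equal_var=True)
welch = st.ttest_ind(a, b, equal_var=False)
print('Student p-value:', student.pvalue)
print('Welch p-value:  ', welch.pvalue)

# Welch's statistic computed by hand, for comparison with SciPy's result.
manual_t = (a.mean() - b.mean()) / np.sqrt(a.var(ddof=1) / a.size + b.var(ddof=1) / b.size)
print('Welch t (SciPy): ', welch.statistic)
print('Welch t (manual):', manual_t)
```

When the spreads differ as much as in the lesson's data, Welch's version is usually the safer choice.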
Task

You are comparing the productivity of employees working from home and from the office. Your goal is to determine whether office workers have a greater mean productivity than home workers using a t-test for independent samples.

  1. Import the scipy.stats library with the alias st.
  2. Use the st.ttest_ind() function to conduct the t-test with the following setup:
    • Samples: office_workers, home_workers.
    • Alternative hypothesis: office > home.
    • Variances are not equal (equal_var=False).
  3. Store the results in the variables tstat and pvalue.
  4. Based on the pvalue, print one of the following messages:
    • "We support the null hypothesis, the mean values are equal" if pvalue > 0.05.
    • "We reject the null hypothesis, the mean values are different" otherwise.
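
The steps above can be sketched on synthetic data as follows. The arrays office_demo and home_demo, the seed, and the alpha variable are hypothetical stand-ins, so the printed outcome says nothing about the real files. The point to notice is the argument order: alternative='greater' tests whether the mean of the first sample exceeds the mean of the second.

```python
import numpy as np
import scipy.stats as st

# Synthetic stand-ins for the two teams (NOT the course data).
rng = np.random.default_rng(1)
home_demo = rng.normal(loc=50, scale=5, size=30)
office_demo = rng.normal(loc=60, scale=10, size=30)

# One-sided Welch's t-test. The sample hypothesised to have the
# greater mean goes first, paired with alternative='greater'.
tstat, pvalue = st.ttest_ind(
    office_demo, home_demo, equal_var=False, alternative='greater'
)
print('t statistic:', tstat)
print('p-value:', pvalue)

# Decision at the 0.05 significance level, using the lesson's messages.
alpha = 0.05
if pvalue > alpha:
    message = 'We support the null hypothesis, the mean values are equal'
else:
    message = 'We reject the null hypothesis, the mean values are different'
print(message)
```

Swapping the two samples and passing alternative='less' expresses the same hypothesis and yields the same p-value.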

Section 6. Chapter 7