Statistics with pandas
The pandas library has built-in methods for calculating the mean and median. To import pandas using the pd alias, use the following syntax:
import pandas as pd
Here's an example that calls .mean() and .median() on the 'work_year' column of the DataFrame named df.
Feel free to change the columns and observe the results:
import pandas as pd

df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/a849660e-ddfa-4033-80a6-94a1b7772e23/update/ds_salaries_statistics', index_col=0)

# Calculating the mean value
mean = df['work_year'].mean()

# Calculating the median value
median = df['work_year'].median()

print('The mean value is', mean)
print('The median value is', median)
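If the remote CSV is unavailable, the same two calls can be tried on a small in-memory DataFrame. The sketch below is only an illustration: the values and the 'salary_in_usd' column name are made up here and are not taken from the course dataset.

```python
import pandas as pd

# Hypothetical sample data (not the course dataset)
df = pd.DataFrame({
    'work_year': [2020, 2021, 2021, 2022, 2022, 2022],
    'salary_in_usd': [50000, 60000, 65000, 70000, 80000, 275000],
})

# Mean: sum of the values divided by their count
mean = df['salary_in_usd'].mean()

# Median: middle value of the sorted data
# (average of the two middle values when the count is even)
median = df['salary_in_usd'].median()

print('The mean value is', mean)
print('The median value is', median)
```

With this sample, the single large salary pulls the mean well above the median, which is one reason to look at both values.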
To calculate these statistics for a column, call the corresponding method on it:
.mean()
.median()
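Both methods can also be applied to several columns in one call. A minimal sketch, assuming a small DataFrame with made-up values (the 'salary_in_usd' column is hypothetical), uses the .agg() method to request the mean and the median together:

```python
import pandas as pd

# Hypothetical sample data (not the course dataset)
df = pd.DataFrame({
    'work_year': [2020, 2021, 2021, 2022, 2022, 2022],
    'salary_in_usd': [50000, 60000, 65000, 70000, 80000, 275000],
})

# One row per statistic, one column per selected column
summary = df[['work_year', 'salary_in_usd']].agg(['mean', 'median'])

print(summary)
```

The result is itself a DataFrame, so a single value can be read with, for example, summary.loc['median', 'work_year'].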
Section 2. Chapter 3