Statistics with pandas
The pandas library provides built-in methods for calculating the mean and median. To import pandas using the pd alias, use the following syntax:
import pandas as pd
Here's an example of calling .mean() and .median() on the 'work_year' column of the dataset named df.
Feel free to change the column name and observe the results:
import pandas as pd

df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/a849660e-ddfa-4033-80a6-94a1b7772e23/update/ds_salaries_statistics', index_col=0)

# Calculating the mean value
mean = df['work_year'].mean()

# Calculating the median value
median = df['work_year'].median()

print('The mean value is', mean)
print('The median value is', median)
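To try the same calls on more than one column, the sketch below computes both statistics in a single step with .agg(). It uses a small made-up DataFrame rather than the course dataset; the 'salary' column and all values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data for illustration; not the course dataset
df = pd.DataFrame({
    'work_year': [2020, 2021, 2021, 2022],
    'salary': [50000, 60000, 65000, 120000],
})

# Compute the mean and median of several columns at once
summary = df[['work_year', 'salary']].agg(['mean', 'median'])

print(summary)
```

The result is a small table with one row per statistic and one column per selected column, which can be easier to read than printing each value separately.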
To calculate these key statistics, call the corresponding method on a column of the DataFrame:
.mean()
.median()
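The two methods can give quite different answers on the same data. As a minimal sketch, assuming a made-up Series with one unusually large value, the example below shows the mean being pulled toward that value while the median stays at the middle observation.

```python
import pandas as pd

# Hypothetical values for illustration; 100 is a deliberate outlier
values = pd.Series([1, 2, 3, 4, 100])

# The mean sums all values and divides by their count
mean_value = values.mean()

# The median is the middle value of the sorted data
median_value = values.median()

print('The mean value is', mean_value)      # 22.0
print('The median value is', median_value)  # 3.0
```

When a column contains extreme values, comparing the two results is a quick way to see how much those values influence the average.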