Pivot Tables
It's time to look at a similar function, .pivot_table(). It works much like .groupby(), but the syntax is different. Here an aggregation function is always applied: if you don't pass one through aggfunc, pandas uses the mean. If you remember, several chapters ago, we were working with this dataset:
And this example:
import pandas as pd

df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/INTRO+to+Python/ds_salaries.csv', index_col=0)
df = df[['salary', 'job_title', 'experience_level']].groupby(['job_title', 'experience_level']).mean()
print(df)
Look at the result:
Let's practice. Look at the implementation using .pivot_table() to get the same result:
import pandas as pd
import numpy as np

df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/INTRO+to+Python/ds_salaries.csv', index_col=0)
# Group by job title and experience level, then average the salary,
# mirroring the .groupby() example above
df = pd.pivot_table(df, index=['job_title', 'experience_level'], values=['salary'], aggfunc=[np.mean])
print(df)
- Pass the dataset as the first argument.
- Pass the columns you want to group by as a list to index; the order matters, just like in .groupby().
- Pass the columns you want to aggregate (to calculate the mean, median, etc.) as a list to values; the order does not matter. This argument is optional: if you omit it, aggfunc is applied to all remaining numeric columns within each group.
- Pass the NumPy functions you want to apply to the grouped values (to calculate the mean, median, etc.) as a list to aggfunc; the order does not matter. These are the functions we have already learned. Use them without parentheses or arguments, just the function name, like np.mean or np.sum.
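The arguments described above can be tried on a small table. This is a minimal sketch, assuming a made-up DataFrame whose column names mirror the salary dataset (the rows and the extra remote_ratio column are hypothetical). It passes the string 'mean', which pandas accepts in place of np.mean, and checks that .groupby() and .pivot_table() produce the same table.

```python
import pandas as pd

# Hypothetical toy data; only the column names mirror the salary dataset.
df = pd.DataFrame({
    'job_title': ['Data Analyst', 'Data Analyst', 'Data Engineer', 'Data Engineer'],
    'experience_level': ['EN', 'EN', 'SE', 'SE'],
    'salary': [50000, 70000, 120000, 140000],
    'remote_ratio': [0, 100, 50, 100],
})

# .groupby(): select the columns, group by the keys, then aggregate.
grouped = (
    df[['salary', 'job_title', 'experience_level']]
    .groupby(['job_title', 'experience_level'])
    .mean()
)

# .pivot_table(): index holds the group keys, values the columns to aggregate.
pivoted = pd.pivot_table(
    df,
    index=['job_title', 'experience_level'],
    values=['salary'],
    aggfunc='mean',
)

print(pivoted)
print(grouped.equals(pivoted))  # compares the two tables element by element

# With values omitted, every remaining numeric column is aggregated.
all_numeric = pd.pivot_table(df, index=['job_title', 'experience_level'], aggfunc='mean')
print(list(all_numeric.columns))
```

Passing a list such as aggfunc=['mean', 'median'] instead of a single name adds an extra column level, one block of columns per function.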
Your task is to create a pivot table that groups the data by plan and computes the mean and median price. Check whether they differ. Follow the algorithm:
- Create a pivot table with the arguments:
  - df as the first argument;
  - 'plan' passed to index;
  - 'price' passed to values;
  - np.mean and np.median passed to aggfunc.
- Print df.
By the way, if the mean and median differ significantly, the data likely contains outliers (extremely small or large values).
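To see how an outlier pulls the mean away from the median, here is a rough sketch on invented data. The plan names and prices are hypothetical, not the task's dataset; only the column names 'plan' and 'price' come from the task. The strings 'mean' and 'median' stand in for np.mean and np.median.

```python
import pandas as pd

# Hypothetical prices; the value 200 is a deliberate outlier in the 'premium' plan.
df = pd.DataFrame({
    'plan': ['basic', 'basic', 'basic', 'premium', 'premium', 'premium'],
    'price': [10, 11, 12, 20, 21, 200],
})

# One row per plan, one column per (aggregation function, value column) pair.
summary = pd.pivot_table(df, index=['plan'], values=['price'], aggfunc=['mean', 'median'])
print(summary)

# Gap between mean and median for each plan; a large gap hints at outliers.
gap = summary[('mean', 'price')] - summary[('median', 'price')]
print(gap)
```

In this toy table the 'basic' plan has matching mean and median, while the single large price in 'premium' lifts its mean well above its median.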