Learn Grouping by Several Columns | Grouping Data
Data Manipulation using pandas

Grouping by Several Columns

Is it possible to group by pairs of values, for instance by country and then by region within each country? Yes, pandas supports this. To group by several columns, use the same .groupby() method and pass it a list of the columns that determine the groups. The picture below shows how such grouping works.

As you can see, the values are first grouped by 'Group' and then by 'Subgroup' within each group. For instance, let's find the number of households for each pair of values in the 'roomh' and 'hhsize' columns (the number of rooms and the number of people in a dwelling, respectively).

# Importing the library
import pandas as pd

# Reading the file
df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/f2947b09-5f0d-4ad9-992f-ec0b87cd4b3f/data4.csv')

# Grouping and aggregating data
print(df.groupby(['roomh', 'hhsize']).size())

The output is long because the number of possible combinations is large. For instance, you can see that there are 59 dwellings with 10 or more rooms and 4 people living in them.
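A long result like this is easier to follow on a small example. The sketch below is not part of the lesson's dataset: it assumes a hypothetical toy DataFrame whose 'Group' and 'Subgroup' columns mirror the names used in the picture. It shows that grouping by a list of columns returns a Series indexed by pairs of values, how to read off the count for one pair, and how .unstack() can turn the result into a compact table.

```python
import pandas as pd

# Hypothetical toy data (an assumption for illustration only);
# 'Group' and 'Subgroup' mirror the column names used in the picture
toy = pd.DataFrame({
    'Group': ['A', 'A', 'A', 'B', 'B', 'B'],
    'Subgroup': ['x', 'x', 'y', 'x', 'y', 'y'],
})

# Grouping by two columns returns a Series indexed by (Group, Subgroup) pairs
counts = toy.groupby(['Group', 'Subgroup']).size()
print(counts)

# Look up the number of rows for a single pair of values
print(counts.loc[('A', 'x')])  # 2

# Reshape the result: one row per Group, one column per Subgroup,
# with 0 for combinations that never occur
table = counts.unstack(fill_value=0)
print(table)
```

The same lookup and reshaping should work on the households result, with 'roomh' and 'hhsize' in place of the toy columns, provided the pair you look up actually occurs in the data.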


Section 3. Chapter 6
