Group Data | Explore Dataset
Introduction to Python for Data Analysis
Group Data

It is time to move on to more complex functions. The first one is .groupby(). As the name suggests, it groups the rows of a DataFrame by the values in one or more columns, but how? An example makes this clearer:

Here is the initial dataset:

Look at the job titles.

Imagine you want to know the mean salary for each specialization. With this much data, calculating it by hand is impractical, so the right approach is a function that groups rows by the values in a column.

Here is the code:

Look at the output:

job_title                             work_year         salary  salary_in_usd
3D Computer Vision Researcher       2021.000000  400000.000000    5409.000000
AI Scientist                        2021.142857  290571.428571   66135.571429
Analytics Engineer                  2022.000000  175000.000000  175000.000000
Applied Data Scientist              2021.600000  172400.000000  175655.000000
Applied Machine Learning Scientist  2021.500000  141350.000000  142068.750000

Here, we can see the mean value of work_year, salary, and salary_in_usd for each job_title.
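Only numeric columns can be averaged. The sketch below uses a small made-up DataFrame (the values and the `company_location` column are hypothetical, not taken from the lesson's dataset) to show one way of keeping text columns out of the calculation. In recent pandas versions (2.0 and later), calling `.mean()` on a grouped DataFrame that still contains text columns raises an error, so passing `numeric_only=True` is a safe habit.

```python
import pandas as pd

# Hypothetical toy data; the lesson's real dataset is much larger.
df = pd.DataFrame({
    "job_title": ["AI Scientist", "AI Scientist", "Analytics Engineer"],
    "company_location": ["US", "DE", "US"],  # text column, hypothetical
    "work_year": [2021, 2022, 2022],
    "salary_in_usd": [60000, 80000, 175000],
})

# numeric_only=True skips text columns such as company_location,
# so only work_year and salary_in_usd are averaged per job title.
means = df.groupby(["job_title"]).mean(numeric_only=True)
print(means)
```

The result has one row per job title and one column per numeric field.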

How does the function work?

  1. DataFrame.groupby() is the general syntax of the groupby function.
  2. DataFrame.groupby(['job_title']): inside the parentheses, we specify the column by which to group the data.
  3. DataFrame.groupby(['job_title']).mean(): we must also tell the program what to do with the numerical values that belong to each group. In our case, we ask it to calculate the mean() of all numerical values that share a job_title.
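The three steps above can be sketched on a tiny DataFrame. The data here is invented purely for illustration, and the variable names `grouped` and `result` are assumptions rather than names from the lesson.

```python
import pandas as pd

# Hypothetical toy data with two job titles, two rows each.
df = pd.DataFrame({
    "job_title": ["Data Analyst", "Data Analyst",
                  "Data Engineer", "Data Engineer"],
    "work_year": [2020, 2022, 2021, 2021],
    "salary_in_usd": [50000, 70000, 90000, 110000],
})

# Step 2: split the rows into groups, one group per job_title.
# Nothing is calculated yet; this only describes the grouping.
grouped = df.groupby(["job_title"])

# Step 3: tell pandas what to do with each group's numeric values.
result = grouped.mean()
print(result)
```

Here `job_title` becomes the index of the result, and each remaining column holds the group mean. Other aggregations, such as `.sum()`, `.max()`, or `.count()`, can be used in place of `.mean()` in the same position.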

Calculate the average salary for each year. To do so, fill in the gaps to group the data by the 'work_year' column.

df = ___.___.mean()


Section 3. Chapter 7