GROUP BY | Grouping
Intermediate SQL

GROUP BY

Hi there! Welcome to the Intermediate SQL course!

In the first section, we're diving into how we can group and aggregate data within our tables.

Let's understand what "grouping data" means using a simple example of an employees table:

department   employee_id  first_name  hire_date             last_name  salary
Engineering  1            John        2015-03-01T00:00:00Z  Doe        80000.00
Engineering  2            Jane        2017-08-15T00:00:00Z  Smith      90000.00
Marketing    3            Alice       2016-11-10T00:00:00Z  Johnson    75000.00
Marketing    4            Bob         2018-06-25T00:00:00Z  Brown      72000.00
...          ...          ...         ...                   ...        ...
Sales        10           James       2017-05-18T00:00:00Z  Clark      68000.00

Now, let's imagine we have a task to "find out the number of employees in each department." To do this, we will group the data by the department column and use aggregation with the COUNT(*) function.

Here's what the implementation will look like:
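
A minimal sketch of such a query, assuming the table is named `employees`. The column types are assumptions, `employee_count` is an illustrative alias, and the setup loads only the five rows visible in the preview above, so the counts describe that sample rather than the full table.

```sql
-- Illustrative setup: only the five rows visible in the preview.
-- Column types are assumptions; the course does not state them.
CREATE TABLE employees (
    employee_id INTEGER PRIMARY KEY,
    first_name  VARCHAR(50),
    last_name   VARCHAR(50),
    department  VARCHAR(50),
    hire_date   DATE,
    salary      DECIMAL(10, 2)
);

INSERT INTO employees
    (employee_id, first_name, last_name, department, hire_date, salary)
VALUES
    (1,  'John',  'Doe',     'Engineering', '2015-03-01', 80000.00),
    (2,  'Jane',  'Smith',   'Engineering', '2017-08-15', 90000.00),
    (3,  'Alice', 'Johnson', 'Marketing',   '2016-11-10', 75000.00),
    (4,  'Bob',   'Brown',   'Marketing',   '2018-06-25', 72000.00),
    (10, 'James', 'Clark',   'Sales',       '2017-05-18', 68000.00);

-- One output row per department, with the number of employees in it.
SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department
ORDER BY department;
```

With these five sample rows, the query returns Engineering with 2 employees, Marketing with 2, and Sales with 1.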

So, as you can see, the syntax for grouping data looks like this:
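
A generic template for this pattern. It is not runnable as written: `column_name`, `aggregated_column`, and `table_name` are placeholders rather than real identifiers.

```sql
-- Template only: replace the placeholders with real names.
SELECT column_name, AGG_FUNC(aggregated_column)
FROM table_name
GROUP BY column_name;
```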

Note

AGG_FUNC stands for any aggregate function, such as MAX, MIN, COUNT, SUM, or AVG.

GROUP BY splits the rows into groups that share the same value in the grouping column, and the aggregate function then returns one result for each group.

Let's consider another example: we've been tasked with finding the department with the highest average salary.

To retrieve such data, we need to group the data by the department column and then use the AVG() function to calculate the average salary:
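
One way to write this, sketched under the assumption that the table is named `employees`. The alias `average_salary` is illustrative. Sorting the averages in descending order puts the department with the highest average salary in the first row.

```sql
-- Average salary per department, highest average first.
SELECT department, AVG(salary) AS average_salary
FROM employees
GROUP BY department
ORDER BY average_salary DESC;
```

In PostgreSQL, MySQL, and SQLite, appending `LIMIT 1` keeps only that top row.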

Note

Please note that we will not use this table in assignments; the employees table serves only to demonstrate syntax and usage.

In this course, we will work with the Montreal Metro system database, which contains the metro_travel_time table.

This table stores the metro line each station belongs to (line_name), the station's name (station_name), and the time it takes a train to travel from that station to the next one (time_to_next_station).

Here is a preview of the table and its data:

id   line_name  station_name  time_to_next_station
1    Green      Angrignon     10
2    Green      Monk          16
3    Green      Verdun        9
4    Green      Charlevoix    17
...  ...        ...           ...
21   Yellow     Longueuil     10

As you can see, this is not a complex table. Let's think about where we can use grouping here.

The most obvious option is to group by metro line. Each line is identified by its color in the line_name column, so grouping by line_name lets us compute one aggregate value per line.
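
For instance, a sketch that counts the stations listed on each line and sums their travel times. The aliases `station_count` and `total_time` are illustrative names, not part of the course database.

```sql
-- One output row per metro line.
SELECT line_name,
       COUNT(*) AS station_count,
       SUM(time_to_next_station) AS total_time
FROM metro_travel_time
GROUP BY line_name;
```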

Now, let's practice grouping by completing a task.

Task

Your task is to find the longest travel time between consecutive stations on each metro line. To do this, apply the MAX() function to the time_to_next_station column, alias the result as max_time, and group the data by the line_name column.

Once you've completed this task, click the button below the code to check your solution.
