Learn Real-World Reporting with group_by() and summarise()


When you build business reports or dashboards, you often need to answer questions like "Who are our top salespeople this quarter?" or "Which product categories are driving the most revenue?" Grouped summaries are the foundation of these answers. With group_by() and summarise() from the dplyr package in R, you can quickly build tables that highlight key trends, comparisons, and performance metrics. Decision-makers rely on tables like these for the up-to-date, accurate information that guides business strategy.


library(dplyr)

# Sample sales data
sales_data <- data.frame(
  salesperson = c("Alice", "Bob", "Charlie", "Alice", "Bob", "Charlie", "Alice"),
  region = c("North", "South", "East", "North", "South", "East", "North"),
  sales = c(12000, 15000, 8000, 18000, 17000, 9000, 20000)
)

# Create a top-performers report: total sales by salesperson, sorted descending
top_performers <- sales_data %>%
  group_by(salesperson) %>%
  summarise(total_sales = sum(sales)) %>%
  arrange(desc(total_sales))

library(knitr)
kable(top_performers)

The workflow for building a report like the one above follows four steps. First, you identify the key grouping variable, in this case salesperson. Next, group_by() organizes the data by that variable. Then, summarise() collapses each group to a single row holding the metric you care about, such as total sales. Finally, arrange() with desc() sorts the results so the highest performers appear at the top, making the table easy to interpret and ready to share in a report or dashboard.
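The four steps can also be run one at a time to see what each contributes. The sketch below reuses the lesson's sample data; the intermediate names (by_person, totals, ranked) and the extra rank and share_pct columns are illustrative additions, not part of the original report.

```r
library(dplyr)

# Same sample data as in the lesson
sales_data <- data.frame(
  salesperson = c("Alice", "Bob", "Charlie", "Alice", "Bob", "Charlie", "Alice"),
  region = c("North", "South", "East", "North", "South", "East", "North"),
  sales = c(12000, 15000, 8000, 18000, 17000, 9000, 20000)
)

# Steps 1-2: choose the grouping variable and group the rows by it
by_person <- group_by(sales_data, salesperson)

# Step 3: collapse each group to one row with the metric of interest
totals <- summarise(by_person, total_sales = sum(sales))

# Step 4: sort so the highest total comes first
ranked <- arrange(totals, desc(total_sales))

# Optional extras (assumed columns): rank and percentage share of all sales
report <- ranked %>%
  mutate(
    rank = row_number(),
    share_pct = round(100 * total_sales / sum(total_sales), 1)
  )

print(report)
```

Because the data is grouped by a single variable, summarise() returns an ungrouped table, so sum(total_sales) inside mutate() is the grand total across all salespeople.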


# Create a summary table for a dashboard: average sales, total sales,
# and number of transactions by region
region_summary <- sales_data %>%
  group_by(region) %>%
  summarise(
    avg_sales = mean(sales),
    total_sales = sum(sales),
    num_transactions = n()
  )

kable(region_summary)

To interpret the summary table you just created, look at each region's average sales, total sales, and number of transactions. This tells you which regions are performing best overall, which might need attention, and where the largest volumes of sales are coming from. Such tables are often used in dashboards to give a quick snapshot of business health and to help prioritize actions.
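One way to make that reading more systematic is to sort the regions by total sales and flag any whose average falls below the overall average. This is a minimal sketch using the lesson's sample data; the overall_avg threshold and the below_overall_avg column are assumptions chosen for illustration, not a standard rule.

```r
library(dplyr)

# Same sample data as in the lesson
sales_data <- data.frame(
  salesperson = c("Alice", "Bob", "Charlie", "Alice", "Bob", "Charlie", "Alice"),
  region = c("North", "South", "East", "North", "South", "East", "North"),
  sales = c(12000, 15000, 8000, 18000, 17000, 9000, 20000)
)

# Region summary as built in the lesson
region_summary <- sales_data %>%
  group_by(region) %>%
  summarise(
    avg_sales = mean(sales),
    total_sales = sum(sales),
    num_transactions = n()
  )

# Benchmark: average sale across all transactions
overall_avg <- mean(sales_data$sales)

# Largest regions first, with a flag for regions that may need attention
region_flags <- region_summary %>%
  arrange(desc(total_sales)) %>%
  mutate(below_overall_avg = avg_sales < overall_avg)

print(region_flags)
```

With this sample data, North leads on total sales and East is the only region whose average sale is below the overall average.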

Note

Once you have created a summary table in R, you can export it to CSV with write.csv() or to Excel using packages like writexl. This makes it easy to share your results with colleagues and stakeholders who may not use R.
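As a rough sketch of the export step, the code below writes the top-performers table to a CSV file and, if the writexl package happens to be installed, to an Excel file as well. The file names and the use of tempdir() are placeholders; in practice you would point these at your own project folder.

```r
library(dplyr)

# Same sample data and report as in the lesson
sales_data <- data.frame(
  salesperson = c("Alice", "Bob", "Charlie", "Alice", "Bob", "Charlie", "Alice"),
  region = c("North", "South", "East", "North", "South", "East", "North"),
  sales = c(12000, 15000, 8000, 18000, 17000, 9000, 20000)
)

top_performers <- sales_data %>%
  group_by(salesperson) %>%
  summarise(total_sales = sum(sales)) %>%
  arrange(desc(total_sales))

# CSV export; row.names = FALSE avoids an extra column of row numbers
csv_path <- file.path(tempdir(), "top_performers.csv")
write.csv(top_performers, csv_path, row.names = FALSE)

# Excel export, only attempted when writexl is available
if (requireNamespace("writexl", quietly = TRUE)) {
  xlsx_path <- file.path(tempdir(), "top_performers.xlsx")
  writexl::write_xlsx(top_performers, xlsx_path)
}

# Read the CSV back to confirm the export round-trips
check <- read.csv(csv_path)
print(check)
```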

1. How can `group_by()` and `summarise()` be used to create business reports?

2. What are some common metrics included in dashboards?

3. Why is it useful to arrange summary tables by performance?
