Notice: This page requires JavaScript to function properly.
Please enable JavaScript in your browser settings or update your browser.
Leer Cohort Analysis | Analytical SQL Techniques
SQL for Analytical Reports
Sectie 1. Hoofdstuk 5
single

single

bookCohort Analysis

Veeg om het menu te tonen

Cohort analysis is a powerful analytical technique used to understand user retention and behavioral trends by grouping users into cohorts—groups of users who share a common characteristic, typically their signup date. By tracking the activity and retention of each cohort over time, you gain insights into the long-term value of your product and the effectiveness of user engagement strategies. The process involves cohort assignment, where users are grouped by their signup period (such as month or week), and retention calculation, where you measure how many users from each cohort remain active or return in subsequent periods.

Note
Definition

A cohort is a group of users who share a common event within a defined time span, such as signing up in the same month.
Business Value:
Cohort analysis reveals how different user groups behave over time, enabling you to identify trends in retention, the impact of product changes, and opportunities for growth or intervention.

To perform cohort analysis in SQL, you first need to assign users to their respective cohorts based on their signup date. For example, you can group users by the month they signed up and then track their activity in later months to measure retention.

12345678
-- Assign users to monthly cohorts and show their signup month SELECT user_id, username, signup_date, DATE_TRUNC('month', signup_date) AS signup_month FROM users ORDER BY signup_date;
copy

After users are assigned to cohorts, you can calculate retention by checking which users from each cohort return or remain active in the following months. This approach allows you to build a retention matrix, where each cell shows the number of users from a given cohort who are still active after a certain number of months.

Suppose a SaaS product manager wants to see retention rates for users who signed up each month. They need to know, for each signup month, what percentage of users are still active in subsequent months. This helps to reveal how well the product retains users over time and whether any onboarding or product changes have improved retention for newer cohorts.

123456789101112131415161718192021222324252627
-- Calculate monthly retention by cohort for the SaaS platform WITH signup_cohorts AS ( SELECT user_id, DATE_TRUNC('month', signup_date) AS cohort_month FROM users ), activity_months AS ( SELECT user_id, DATE_TRUNC('month', signup_date) AS cohort_month, DATE_TRUNC('month', CURRENT_DATE) AS activity_month, is_active FROM users ) SELECT s.cohort_month, a.activity_month, COUNT(DISTINCT a.user_id) AS active_users, COUNT(DISTINCT s.user_id) AS cohort_size, ROUND(100.0 * COUNT(DISTINCT a.user_id) / COUNT(DISTINCT s.user_id), 2) AS retention_rate FROM signup_cohorts s LEFT JOIN activity_months a ON s.user_id = a.user_id AND a.is_active = TRUE GROUP BY s.cohort_month, a.activity_month ORDER BY s.cohort_month, a.activity_month;
copy

In this scenario, you can see how cohort analysis supports business decisions. It allows the SaaS product manager to spot patterns, such as whether users who joined in March are more likely to remain active after three months compared to those who joined in February.

Now, apply your cohort analysis skills to a practical problem. Imagine you are asked to write a query that shows, for each signup month, the percentage of users who are still active exactly three months after their signup month. This measure is a key indicator of medium-term retention and is often used to benchmark user engagement for subscription-based products.

Taak

Veeg om te beginnen met coderen

Write a SQL query that, for each signup month, shows the percentage of users who are still active exactly three months after their signup month.

  • Group users by the month of their signup date.
  • For each cohort, determine the number of users whose is_active status is TRUE and for whom at least three full months have passed since their cohort month.
  • Calculate the retention rate as the percentage of the cohort who are still active after three months.
  • Return the cohort month, cohort size, number retained after three months, and the retention rate rounded to two decimal places.

Oplossing

Switch to desktopSchakel over naar desktop voor praktijkervaringGa verder vanaf waar je bent met een van de onderstaande opties
Was alles duidelijk?

Hoe kunnen we het verbeteren?

Bedankt voor je feedback!

Sectie 1. Hoofdstuk 5
single

single

Vraag AI

expand

Vraag AI

ChatGPT

Vraag wat u wilt of probeer een van de voorgestelde vragen om onze chat te starten.

some-alt