Notice: This page requires JavaScript to function properly.
Please enable JavaScript in your browser settings or update your browser.
Вивчайте Cohort Analysis | Analytical SQL Techniques
SQL for Analytical Reports
Секція 1. Розділ 5
single

single

bookCohort Analysis

Свайпніть щоб показати меню

Cohort analysis is a powerful analytical technique used to understand user retention and behavioral trends by grouping users into cohorts—groups of users who share a common characteristic, typically their signup date. By tracking the activity and retention of each cohort over time, you gain insights into the long-term value of your product and the effectiveness of user engagement strategies. The process involves cohort assignment, where users are grouped by their signup period (such as month or week), and retention calculation, where you measure how many users from each cohort remain active or return in subsequent periods.

Note
Definition

A cohort is a group of users who share a common event within a defined time span, such as signing up in the same month.
Business Value:
Cohort analysis reveals how different user groups behave over time, enabling you to identify trends in retention, the impact of product changes, and opportunities for growth or intervention.

To perform cohort analysis in SQL, you first need to assign users to their respective cohorts based on their signup date. For example, you can group users by the month they signed up and then track their activity in later months to measure retention.

12345678
-- Assign users to monthly cohorts and show their signup month SELECT user_id, username, signup_date, DATE_TRUNC('month', signup_date) AS signup_month FROM users ORDER BY signup_date;
copy

After users are assigned to cohorts, you can calculate retention by checking which users from each cohort return or remain active in the following months. This approach allows you to build a retention matrix, where each cell shows the number of users from a given cohort who are still active after a certain number of months.

Suppose a SaaS product manager wants to see retention rates for users who signed up each month. They need to know, for each signup month, what percentage of users are still active in subsequent months. This helps to reveal how well the product retains users over time and whether any onboarding or product changes have improved retention for newer cohorts.

123456789101112131415161718192021222324252627
-- Calculate monthly retention by cohort for the SaaS platform WITH signup_cohorts AS ( SELECT user_id, DATE_TRUNC('month', signup_date) AS cohort_month FROM users ), activity_months AS ( SELECT user_id, DATE_TRUNC('month', signup_date) AS cohort_month, DATE_TRUNC('month', CURRENT_DATE) AS activity_month, is_active FROM users ) SELECT s.cohort_month, a.activity_month, COUNT(DISTINCT a.user_id) AS active_users, COUNT(DISTINCT s.user_id) AS cohort_size, ROUND(100.0 * COUNT(DISTINCT a.user_id) / COUNT(DISTINCT s.user_id), 2) AS retention_rate FROM signup_cohorts s LEFT JOIN activity_months a ON s.user_id = a.user_id AND a.is_active = TRUE GROUP BY s.cohort_month, a.activity_month ORDER BY s.cohort_month, a.activity_month;
copy

In this scenario, you can see how cohort analysis supports business decisions. It allows the SaaS product manager to spot patterns, such as whether users who joined in March are more likely to remain active after three months compared to those who joined in February.

Now, apply your cohort analysis skills to a practical problem. Imagine you are asked to write a query that shows, for each signup month, the percentage of users who are still active exactly three months after their signup month. This measure is a key indicator of medium-term retention and is often used to benchmark user engagement for subscription-based products.

Завдання

Проведіть, щоб почати кодувати

Write a SQL query that, for each signup month, shows the percentage of users who are still active exactly three months after their signup month.

  • Group users by the month of their signup date.
  • For each cohort, determine the number of users whose is_active status is TRUE and for whom at least three full months have passed since their cohort month.
  • Calculate the retention rate as the percentage of the cohort who are still active after three months.
  • Return the cohort month, cohort size, number retained after three months, and the retention rate rounded to two decimal places.

Рішення

Switch to desktopПерейдіть на комп'ютер для реальної практикиПродовжуйте з того місця, де ви зупинились, використовуючи один з наведених нижче варіантів
Все було зрозуміло?

Як ми можемо покращити це?

Дякуємо за ваш відгук!

Секція 1. Розділ 5
single

single

Запитати АІ

expand

Запитати АІ

ChatGPT

Запитайте про що завгодно або спробуйте одне із запропонованих запитань, щоб почати наш чат

some-alt