Learn Cohort Retention & Decay Analysis | Cohort Analysis
Business Analytics and Decision Making with Python

Cohort Retention & Decay Analysis

Understanding how long customers remain engaged and when they tend to drop off is crucial for any business looking to improve customer lifetime value and reduce churn. Retention analysis measures the proportion of users from a given cohort—such as those who made their first purchase in a particular month—who return or remain active in subsequent periods. By structuring users into cohorts based on their initial interaction (as you did in the previous chapter), you can track retention rates over time for each group. This enables you to see how quickly engagement decays and to identify patterns in customer loyalty, which is vital for SaaS, e-commerce, and subscription businesses aiming to optimize customer journeys and maximize recurring revenue.
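Before building a full retention matrix, it can help to see the calculation for a single cohort. The sketch below uses hypothetical toy data (the dictionary `active_by_month` and the helper `retention_rate` are illustrative names, not part of the lesson's code) and treats retention as the share of the cohort's users who are active in a given month.

```python
# Hypothetical toy data: the set of users who placed an order in each month.
active_by_month = {
    "2023-01": {1, 2, 3, 4, 5},
    "2023-02": {1, 2, 3, 6, 7},
    "2023-03": {1, 3, 6},
}

# The January cohort: everyone whose first order was in January.
cohort = active_by_month["2023-01"]


def retention_rate(cohort, active):
    """Share of the cohort's users who are active in the given period."""
    return len(cohort & active) / len(cohort)


for month, active in active_by_month.items():
    print(month, retention_rate(cohort, active))
```

Users 6 and 7 are active in February and March but are ignored here, because they belong to a later cohort.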

    import pandas as pd

    # Fixed realistic transaction data
    data = {
        "user_id": [
            # Cohort 2023-01 users
            1, 2, 3, 4, 5,
            1, 2, 3,
            1, 3,
            # Cohort 2023-02 users
            6, 7, 8, 9,
            6, 7,
            6,
            # Cohort 2023-03 users
            10, 11, 12,
            10, 12,
        ],
        "order_month": [
            # Cohort 1 purchases
            "2023-01", "2023-01", "2023-01", "2023-01", "2023-01",
            "2023-02", "2023-02", "2023-02",
            "2023-03", "2023-03",
            # Cohort 2 purchases
            "2023-02", "2023-02", "2023-02", "2023-02",
            "2023-03", "2023-03",
            "2023-04",
            # Cohort 3 purchases
            "2023-03", "2023-03", "2023-03",
            "2023-04", "2023-04",
        ],
    }

    df = pd.DataFrame(data)

    # Assign cohort_month as the first purchase month
    df["cohort_month"] = df.groupby("user_id")["order_month"].transform("min")

    # Convert to datetime
    df["order_month"] = pd.to_datetime(df["order_month"])
    df["cohort_month"] = pd.to_datetime(df["cohort_month"])

    # Cohort index: 1 for the first purchase month, 2 for the next month, and so on
    df["cohort_index"] = (
        (df["order_month"].dt.year - df["cohort_month"].dt.year) * 12
        + (df["order_month"].dt.month - df["cohort_month"].dt.month)
        + 1
    )

    # Pivot table: unique active users per cohort and cohort index
    cohort_counts = (
        df.groupby(["cohort_month", "cohort_index"])["user_id"]
        .nunique()
        .reset_index()
        .pivot(index="cohort_month", columns="cohort_index", values="user_id")
    )

    # Retention matrix: divide each row by the cohort's size in its first month
    cohort_sizes = cohort_counts.iloc[:, 0]
    retention = cohort_counts.divide(cohort_sizes, axis=0).round(3)

    print(retention)
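The cohort index arithmetic is easy to get wrong around year boundaries, so here is a small standalone check of the same formula. The helper name `cohort_index` and the sample months are assumptions chosen for illustration; the sketch uses only the standard library and `"YYYY-MM"` strings.

```python
from datetime import datetime


def cohort_index(order_month, cohort_month):
    """Months elapsed since the cohort month, counting the cohort month as 1."""
    order = datetime.strptime(order_month, "%Y-%m")
    cohort = datetime.strptime(cohort_month, "%Y-%m")
    return (order.year - cohort.year) * 12 + (order.month - cohort.month) + 1


# Same month as the first purchase
print(cohort_index("2023-01", "2023-01"))
# Two months after the first purchase
print(cohort_index("2023-03", "2023-01"))
# Crossing a year boundary: November, December, January, February
print(cohort_index("2024-02", "2023-11"))
```

Multiplying the year difference by 12 is what keeps the index increasing across December to January; the month difference alone would go negative there.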
    import seaborn as sns
    import matplotlib.pyplot as plt

    # Assume 'retention' DataFrame is already computed as above
    plt.figure(figsize=(8, 5))

    sns.heatmap(
        retention,
        annot=True,
        fmt=".0%",
        cmap="Blues",
        cbar=True,
        linewidths=0.5,
        linecolor="gray",
    )

    plt.title("Cohort Retention Rates by Month")
    plt.ylabel("Cohort Month (First Purchase)")
    # cohort_index starts at 1, so column 1 is the first purchase month itself
    plt.xlabel("Cohort Index (1 = Month of First Purchase)")
    plt.yticks(rotation=0)
    plt.show()

When you examine the retention heatmap, you will notice that retention is 100% in each cohort's first month. This holds by definition, because a cohort consists of the users who made their first purchase in that month. In later months the rates usually decline as some customers become inactive or churn, and the most recent cohorts have empty cells for periods they have not reached yet. The rate and pattern of this decay reveal important business insights:

  • A SaaS company may see a gradual drop each month, indicating steady but manageable churn;
  • An e-commerce business might display sharp drops after the first or second month, reflecting a reliance on one-time buyers.

High retention in later months signals strong product-market fit and customer loyalty, while steep early declines may highlight issues with onboarding, value delivery, or product satisfaction. Recognizing these patterns allows you to target improvements where they will have the most impact on long-term revenue and growth.
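To put numbers on the decay, one option is to average the retention matrix across cohorts and look at how much is lost from one period to the next. The sketch below is an illustration under stated assumptions, not a prescribed method: the `retention` values are hard-coded sample figures, the average is an unweighted mean over cohorts, and the column names `avg_retention`, `point_drop`, and `step_retention` are made up for this example.

```python
import pandas as pd

# Illustrative retention matrix (rows: cohorts, columns: cohort index).
# NaN marks a period that the cohort has not reached yet.
retention = pd.DataFrame(
    {
        1: [1.0, 1.0, 1.0],
        2: [0.6, 0.5, 0.667],
        3: [0.4, 0.25, float("nan")],
    },
    index=["2023-01", "2023-02", "2023-03"],
)
retention.index.name = "cohort_month"
retention.columns.name = "cohort_index"

# Average retention per period; NaN cells are skipped by default
avg_curve = retention.mean(axis=0)

# Absolute change in retention versus the previous period
point_drop = avg_curve.diff()

# Share of the previous period's retained users who are still retained
step_retention = avg_curve / avg_curve.shift(1)

decay = pd.DataFrame(
    {
        "avg_retention": avg_curve,
        "point_drop": point_drop,
        "step_retention": step_retention,
    }
).round(3)

print(decay)
```

Later periods are averaged over fewer cohorts (here, period 3 excludes the newest cohort), so the tail of the curve rests on less data and should be read with more caution than the early periods.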


Which of the following best describes what a steep drop in cohort retention after the first month usually indicates about a business?


Section 1. Chapter 2
