Learn Cohort Retention & Decay Analysis | Cohort Analysis
Business Analytics and Decision Making with Python

Cohort Retention & Decay Analysis

Knowing how long customers stay engaged, and when they tend to drop off, is crucial for any business that wants to improve customer lifetime value and reduce churn. Retention analysis measures the proportion of users in a given cohort (for example, everyone who made their first purchase in a particular month) who return or remain active in later periods. By grouping users into cohorts based on their initial interaction (as you did in the previous chapter), you can track each group's retention rate over time. This shows how quickly engagement decays and reveals patterns in customer loyalty, which matters for SaaS, e-commerce, and subscription businesses that depend on recurring revenue.
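Before building the full matrix, the definition above can be checked by hand on a single cohort. The following is a minimal sketch, assuming activity is stored as one set of user IDs per month; the `active_users` mapping and the `retention_rate` helper are illustrative names, not part of the lesson's code.

```python
# Hypothetical activity log: month -> set of user IDs with a purchase that month.
active_users = {
    "2023-01": {1, 2, 3, 4, 5},
    "2023-02": {1, 2, 3, 6, 7, 8, 9},
    "2023-03": {1, 3, 6, 7, 10, 11, 12},
}

def retention_rate(cohort: set, later_active: set) -> float:
    """Share of the cohort that is active again in a later period."""
    return len(cohort & later_active) / len(cohort)

# January is the first month in the log, so everyone active then forms the cohort.
cohort = active_users["2023-01"]

for month in ("2023-02", "2023-03"):
    rate = retention_rate(cohort, active_users[month])
    print(month, f"{rate:.0%}")
```

The set intersection keeps only users who belong to the cohort, so buyers who first appear in February or March do not inflate the January cohort's rate.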

```python
import pandas as pd

# Small fixed transaction dataset: one row per user and purchase month
data = {
    "user_id": [
        # Cohort 2023-01 users
        1, 2, 3, 4, 5,
        1, 2, 3,
        1, 3,
        # Cohort 2023-02 users
        6, 7, 8, 9,
        6, 7,
        6,
        # Cohort 2023-03 users
        10, 11, 12,
        10, 12,
    ],
    "order_month": [
        # Cohort 2023-01 purchases
        "2023-01", "2023-01", "2023-01", "2023-01", "2023-01",
        "2023-02", "2023-02", "2023-02",
        "2023-03", "2023-03",
        # Cohort 2023-02 purchases
        "2023-02", "2023-02", "2023-02", "2023-02",
        "2023-03", "2023-03",
        "2023-04",
        # Cohort 2023-03 purchases
        "2023-03", "2023-03", "2023-03",
        "2023-04", "2023-04",
    ],
}

df = pd.DataFrame(data)

# Convert month strings to datetime so year and month can be compared
df["order_month"] = pd.to_datetime(df["order_month"])

# A user's cohort is the month of their first purchase
df["cohort_month"] = df.groupby("user_id")["order_month"].transform("min")

# Cohort index: 1 for the first-purchase month, 2 for the next month, and so on
df["cohort_index"] = (
    (df["order_month"].dt.year - df["cohort_month"].dt.year) * 12
    + (df["order_month"].dt.month - df["cohort_month"].dt.month)
    + 1
)

# Count distinct active users per cohort and period, one row per cohort
cohort_counts = (
    df.groupby(["cohort_month", "cohort_index"])["user_id"]
    .nunique()
    .reset_index()
    .pivot(index="cohort_month", columns="cohort_index", values="user_id")
)

# Retention matrix: divide each row by the cohort's size in its first month
cohort_sizes = cohort_counts.iloc[:, 0]
retention = cohort_counts.divide(cohort_sizes, axis=0).round(3)

# The 2023-01 cohort, for example, retains 0.6 at index 2 and 0.4 at index 3
print(retention)
```
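The cohort index arithmetic is easier to trust once it has been checked on a few dates by hand, especially across a year boundary, where the month difference alone goes negative. A minimal sketch follows; `cohort_index` is a hypothetical standalone helper that mirrors the formula used on the DataFrame columns, and the 2022 dates are made up for illustration.

```python
from datetime import date

def cohort_index(order_month: date, cohort_month: date) -> int:
    """1-based number of months between the cohort month and the order month."""
    year_gap = order_month.year - cohort_month.year
    month_gap = order_month.month - cohort_month.month  # may be negative
    return year_gap * 12 + month_gap + 1

# Same month as the first purchase
print(cohort_index(date(2023, 1, 1), date(2023, 1, 1)))   # 1
# Two months later within the same year
print(cohort_index(date(2023, 3, 1), date(2023, 1, 1)))   # 3
# Across a year boundary: 12 + (2 - 11) + 1
print(cohort_index(date(2023, 2, 1), date(2022, 11, 1)))  # 4
```

Without the year term, the last case would evaluate to -8, which is why the formula multiplies the year gap by 12 before adding the month gap.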
```python
import seaborn as sns
import matplotlib.pyplot as plt

# Assumes the 'retention' DataFrame has already been computed as above
plt.figure(figsize=(8, 5))
sns.heatmap(
    retention,
    annot=True,
    fmt=".0%",
    cmap="Blues",
    cbar=True,
    linewidths=0.5,
    linecolor="gray",
    # Show cohorts as YYYY-MM instead of full timestamps
    yticklabels=retention.index.strftime("%Y-%m"),
)
plt.title("Cohort Retention Rates by Month")
plt.ylabel("Cohort Month (First Purchase)")
plt.xlabel("Cohort Index (1 = First Purchase Month)")
plt.yticks(rotation=0)
plt.show()
```

When you examine the retention heatmap, each cohort starts at 100% in its first month, because a cohort is defined by the users who purchased in that month. In later months the rate usually declines as some customers become inactive or churn. Newer cohorts have blank cells for periods that have not been observed yet (here, the 2023-03 cohort has no third month). The rate and pattern of this decay reveal important business insights:

  • A SaaS company may see a gradual drop each month, indicating steady but manageable churn;
  • An e-commerce business might display sharp drops after the first or second month, reflecting a reliance on one-time buyers.

High retention in later months signals strong product-market fit and customer loyalty, while steep early declines may highlight issues with onboarding, value delivery, or product satisfaction. Recognizing these patterns allows you to target improvements where they will have the most impact on long-term revenue and growth.
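The decay patterns described above can also be read off numerically rather than from the heatmap alone. The sketch below is one possible approach, not the lesson's own method: it hard-codes the retention values produced by the example dataset, stores each cohort as a plain list of rates (shorter for newer cohorts), and uses two illustrative helpers, `average_curve` and `step_retention`.

```python
# Retention rates from the example dataset, one list per cohort.
# Position 0 is the first-purchase month; newer cohorts have fewer entries.
retention_by_cohort = {
    "2023-01": [1.0, 0.6, 0.4],
    "2023-02": [1.0, 0.5, 0.25],
    "2023-03": [1.0, 0.667],
}

def average_curve(matrix: dict) -> list:
    """Mean retention per period, using only cohorts observed in that period."""
    longest = max(len(rates) for rates in matrix.values())
    curve = []
    for k in range(longest):
        observed = [rates[k] for rates in matrix.values() if k < len(rates)]
        curve.append(sum(observed) / len(observed))
    return curve

def step_retention(rates: list) -> list:
    """Share of the previous period's active users who are still active."""
    return [later / earlier for earlier, later in zip(rates, rates[1:])]

print([round(r, 3) for r in average_curve(retention_by_cohort)])
# [1.0, 0.589, 0.325]

for cohort, rates in retention_by_cohort.items():
    print(cohort, [round(s, 3) for s in step_retention(rates)])
```

A step retention that rises from one period to the next (0.6, then about 0.667, for the 2023-01 cohort) suggests the decay is flattening, while a low first step points to the steep early drop typical of one-time buyers. With a pandas retention matrix, `retention.mean(axis=0)` gives the same average curve, since missing cells are skipped by default.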


Which of the following best describes what a steep drop in cohort retention after the first month usually indicates about a business?


Section 1. Chapter 2
