Challenge: Mapping the Lifecycle of a SpaceStream Explorer
You are now tasked with calculating advanced retention metrics for SpaceStream, an intergalactic holovision service. As the Lead Data Analyst, you will analyze a cohort of 5 users over three months, tracking who stays loyal and who drifts away. Your goal is to compute three critical metrics for each month: Retention Rate, Churn Rate, and Survival Rate.
Begin by examining the provided dataset, where each user is marked as active (1) or inactive (0) for each month. The columns month_0, month_1, and month_2 represent activity across three consecutive months. Your solution will require you to use pandas to process this dataset and extract the necessary metrics for each month.
    import pandas as pd

    data = {
        "user_id": [1, 2, 3, 4, 5],
        "month_0": [1, 1, 1, 1, 1],  # Everyone starts active
        "month_1": [1, 0, 1, 0, 1],  # 3 users active
        "month_2": [1, 0, 0, 0, 0],  # 1 user active
    }
    df = pd.DataFrame(data)

    # Calculating retention rate: fraction of original cohort active in each month
    cohort_size = len(df)
    retention_rate = [df[f"month_{i}"].sum() / cohort_size for i in range(3)]

    # Calculating churn rate: 1 - retention rate
    churn_rate = [1 - r for r in retention_rate]

    # Calculating survival rate: fraction of users still active in ALL months up to i
    survival_rate = []
    for i in range(3):
        still_active = df[[f"month_{j}" for j in range(i + 1)]].all(axis=1).sum()
        survival_rate.append(still_active / cohort_size)

    print("retention_rate:", retention_rate)
    print("churn_rate:", churn_rate)
    print("survival_rate:", survival_rate)
This code calculates the required metrics for each month. The retention rate is the fraction of the original cohort that is active in a given month. The churn rate is one minus the retention rate: the proportion of the original cohort that is not active in that month. The survival rate counts only users who have been continuously active from the first month through the current one, so a user must have a 1 in every month so far.
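In the sample data, retention and survival happen to be identical because no user comes back after going inactive. The sketch below uses a hypothetical four-user cohort (the `toy` DataFrame is invented for illustration and is not part of the challenge data) in which user 2 skips month_1 and returns in month_2, which is the case where the two metrics diverge.

```python
import pandas as pd

# Hypothetical cohort, invented for illustration:
# user 2 is inactive in month_1 but returns in month_2.
toy = pd.DataFrame({
    "user_id": [1, 2, 3, 4],
    "month_0": [1, 1, 1, 1],
    "month_1": [1, 0, 1, 0],
    "month_2": [1, 1, 0, 0],
})

month_cols = ["month_0", "month_1", "month_2"]
cohort_size = len(toy)

# Retention looks only at the current month, so the returning user counts again.
toy_retention = [float(toy[col].sum() / cohort_size) for col in month_cols]

# Survival requires a 1 in every month so far, so the returning user stays excluded.
toy_survival = []
for i in range(len(month_cols)):
    still_active = toy[month_cols[: i + 1]].all(axis=1).sum()
    toy_survival.append(float(still_active / cohort_size))

print("retention:", toy_retention)  # retention: [1.0, 0.5, 0.5]
print("survival:", toy_survival)    # survival: [1.0, 0.5, 0.25]
```

In month_2, retention stays at 0.5 because users 1 and 2 are both active, while survival drops to 0.25 because only user 1 has an unbroken streak.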
Write a Python function called calculate_cohort_metrics(df) that takes in a DataFrame with the same structure as above and returns three lists: retention_rate, churn_rate, and survival_rate for each month. Your function should:
- Accept a DataFrame where each row is a user and each column after user_id is a month (e.g., month_0, month_1, ...).
- Calculate retention rate for each month as the fraction of cohort users active in that month.
- Calculate churn rate for each month as one minus the retention rate.
- Calculate survival rate for each month as the fraction of users who were active in all months up to and including that month.
- Return the three lists in the order: retention_rate, churn_rate, survival_rate.
Solution
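The following is a minimal sketch of one possible approach, not necessarily the course's reference solution. It assumes that every column other than user_id is a month column, that the month columns are already in chronological order, and that the cohort is non-empty (an empty DataFrame would divide by zero).

```python
import pandas as pd


def calculate_cohort_metrics(df):
    # Assumption: every column except user_id is a month column,
    # and the columns appear in chronological order.
    month_cols = [col for col in df.columns if col != "user_id"]
    cohort_size = len(df)  # assumed to be greater than zero

    retention_rate = []
    churn_rate = []
    survival_rate = []
    for i, col in enumerate(month_cols):
        # Fraction of the original cohort active in this month
        retained = float(df[col].sum() / cohort_size)
        retention_rate.append(retained)
        churn_rate.append(1 - retained)

        # Users with a 1 in every month up to and including this one
        still_active = df[month_cols[: i + 1]].all(axis=1).sum()
        survival_rate.append(float(still_active / cohort_size))

    return retention_rate, churn_rate, survival_rate
```

Because the month columns are discovered from the DataFrame rather than hard-coded with range(3), the same function should work for cohorts tracked over any number of months. Called on the five-user dataset from the walkthrough, it should return the same three lists that the walkthrough code prints.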