Building a Simple Attrition Model
Understanding when employees might leave an organization, known as employee attrition, is a key challenge in people analytics. Predicting attrition helps HR teams intervene early and retain valuable talent. One of the most common approaches for this task is logistic regression, a machine learning method designed for binary classification problems. In the context of attrition, you are trying to predict whether an employee will stay (0) or leave (1), based on various features like job satisfaction, years at the company, and more.
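To make the idea of binary classification concrete, here is a minimal sketch of how logistic regression turns feature values into a probability of leaving. The weights and intercept below are hypothetical values chosen only for illustration; a real model learns them from data.

```python
import numpy as np

def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights and intercept (illustration only, not fitted values)
weights = np.array([-3.0, -0.2])   # Satisfaction, YearsAtCompany
intercept = 2.0

# One hypothetical employee: satisfaction 0.4, 2 years at the company
employee = np.array([0.4, 2.0])

score = intercept + weights @ employee   # linear score (the log-odds)
prob_leave = sigmoid(score)              # probability of attrition
predicted_class = int(prob_leave > 0.5)  # 1 = leave, 0 = stay

print("Score:", score)
print("Probability of leaving:", prob_leave)
print("Predicted class:", predicted_class)
```

The linear score can be any real number; the sigmoid squeezes it into the range 0 to 1 so it can be read as a probability.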
```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hardcoded HR data: each row is an employee
# Features: Satisfaction, YearsAtCompany, Age
# Target: Attrition (1=Yes, 0=No)
data = {
    'Satisfaction': [0.9, 0.4, 0.3, 0.7, 0.2, 0.8, 0.5, 0.6],
    'YearsAtCompany': [5, 2, 1, 6, 1, 7, 3, 4],
    'Age': [34, 28, 25, 45, 23, 50, 30, 38],
    'Attrition': [0, 1, 1, 0, 1, 0, 1, 0]
}

df = pd.DataFrame(data)

X = df[['Satisfaction', 'YearsAtCompany', 'Age']]
y = df['Attrition']

# Fit logistic regression model
model = LogisticRegression()
model.fit(X, y)
```
After fitting the logistic regression model, you obtain one coefficient for each feature. Each coefficient describes how that feature relates to the log-odds of attrition, holding the other features fixed. A positive coefficient means that as the feature increases, the likelihood of attrition increases. A negative coefficient means the opposite. For example, if Satisfaction has a negative coefficient, higher satisfaction is associated with a lower probability of leaving. The model outputs the probability that an employee will leave and, by default, predicts attrition when that probability is greater than 0.5.
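One way to inspect these coefficients is sketched below. It refits the model on the lesson's toy data so the snippet runs on its own, then reads `coef_` and `intercept_` from the fitted scikit-learn estimator. The `coef_table` and `manual_predictions` names are introduced here for illustration. Exponentiating a coefficient gives an odds ratio, which is often easier to read than the raw log-odds value.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Same toy HR data as in the lesson
data = {
    'Satisfaction': [0.9, 0.4, 0.3, 0.7, 0.2, 0.8, 0.5, 0.6],
    'YearsAtCompany': [5, 2, 1, 6, 1, 7, 3, 4],
    'Age': [34, 28, 25, 45, 23, 50, 30, 38],
    'Attrition': [0, 1, 1, 0, 1, 0, 1, 0]
}
df = pd.DataFrame(data)
features = ['Satisfaction', 'YearsAtCompany', 'Age']
X = df[features]
y = df['Attrition']

model = LogisticRegression()
model.fit(X, y)

# For a binary problem, coef_ has shape (1, n_features)
coef_table = pd.DataFrame({
    'feature': features,
    'coefficient': model.coef_[0],          # effect on the log-odds
    'odds_ratio': np.exp(model.coef_[0]),   # multiplicative effect on the odds
})
print(coef_table)
print("Intercept:", model.intercept_[0])

# The default decision rule: predict attrition when P(leave) > 0.5
probabilities = model.predict_proba(X)[:, 1]
manual_predictions = (probabilities > 0.5).astype(int)
print("Matches model.predict:", np.array_equal(manual_predictions, model.predict(X)))
```

An odds ratio below 1 corresponds to a negative coefficient, and one above 1 to a positive coefficient. With only eight rows and unscaled features, the fitted values here are illustrative rather than reliable.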
```python
from sklearn.metrics import accuracy_score

# Predict attrition for the same employees
predictions = model.predict(X)
probabilities = model.predict_proba(X)[:, 1]

# Evaluate accuracy
accuracy = accuracy_score(y, predictions)

print("Predicted Attrition:", predictions)
print("Predicted Probabilities:", probabilities)
print("Model Accuracy:", accuracy)
```
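Accuracy is the share of predictions that match the true labels. As a rough sketch of what sits behind that number, the snippet below refits the same toy model and rebuilds accuracy from a confusion matrix. The variable names `tn`, `fp`, `fn`, `tp`, and `manual_accuracy` are introduced here for illustration. Because the model is scored on the same rows it was trained on, the result is likely to be optimistic.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix

# Same toy HR data as in the lesson
df = pd.DataFrame({
    'Satisfaction': [0.9, 0.4, 0.3, 0.7, 0.2, 0.8, 0.5, 0.6],
    'YearsAtCompany': [5, 2, 1, 6, 1, 7, 3, 4],
    'Age': [34, 28, 25, 45, 23, 50, 30, 38],
    'Attrition': [0, 1, 1, 0, 1, 0, 1, 0]
})
X = df[['Satisfaction', 'YearsAtCompany', 'Age']]
y = df['Attrition']

model = LogisticRegression()
model.fit(X, y)
predictions = model.predict(X)

# Rows are true labels, columns are predicted labels, in the order [0, 1]
tn, fp, fn, tp = confusion_matrix(y, predictions, labels=[0, 1]).ravel()

# Accuracy = correct predictions / all predictions
manual_accuracy = (tp + tn) / (tp + tn + fp + fn)

print("Correctly predicted stayers (TN):", tn)
print("Correctly predicted leavers (TP):", tp)
print("False alarms (FP):", fp)
print("Missed leavers (FN):", fn)
print("Accuracy from counts:", manual_accuracy)
print("accuracy_score:", accuracy_score(y, predictions))
```

Splitting the count of errors into false alarms and missed leavers shows which kind of mistake the model makes, something a single accuracy figure hides.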
1. What type of problem is employee attrition prediction?
2. Fill in the blank: Logistic regression is used for ____ classification problems.
3. What does model accuracy tell you about your predictions?