Learn Identifying Trends and Outliers | Analyzing Workforce Trends
Python for People Analytics

Identifying Trends and Outliers

Understanding how to identify trends and outliers is a crucial part of People Analytics. Trends reveal patterns over time, such as increasing employee tenure or changing turnover rates, while outliers are data points that deviate significantly from the rest of your data. Both can provide valuable insights when analyzing workforce data, helping you make informed decisions about hiring, retention, and employee development.
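
The trend side of this can be sketched in a few lines. The example below is a minimal illustration, not part of the lesson's own code: the monthly turnover rates are hypothetical values made up for demonstration, and the 3-month window and straight-line fit are assumptions chosen for simplicity. It smooths the series with a rolling mean and estimates the average monthly change with a least-squares line.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly turnover rates (%) for one year; illustrative values only
turnover = pd.Series([2.1, 2.3, 2.0, 2.4, 2.6, 2.5, 2.8, 2.7, 3.0, 2.9, 3.2, 3.1])

# A 3-month rolling mean smooths month-to-month noise
# (the first two entries are NaN because the window is not yet full)
rolling_mean = turnover.rolling(window=3).mean()

# Slope of a least-squares line: average change in the rate per month
months = np.arange(len(turnover))
slope, intercept = np.polyfit(months, turnover.to_numpy(), deg=1)

print(rolling_mean.round(2))
print(f"Estimated trend: {slope:+.3f} percentage points per month")
```

A positive slope suggests turnover is rising over the period, and a negative one suggests it is falling. With only a handful of points, treat the estimate as a rough signal rather than a forecast.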

```python
import pandas as pd
import numpy as np
from scipy import stats

# Sample employee tenure data (in years)
tenure_data = pd.Series([1, 2, 3, 2, 5, 4, 3, 35, 2, 3, 4, 3, 2, 4, 3])

# Calculate mean and median
mean_tenure = tenure_data.mean()
median_tenure = tenure_data.median()

# Identify outliers using Z-score
z_scores = np.abs(stats.zscore(tenure_data))
outliers = tenure_data[z_scores > 2]

print("Mean tenure:", mean_tenure)
print("Median tenure:", median_tenure)
print("Outliers in tenure data:", outliers.values)
```

Outliers can have a significant impact on HR decisions. For example, an unusually long tenure might indicate a unique career path or a data entry error, while a very short tenure could signal issues with onboarding or job satisfaction. If outliers are not handled properly, they can skew averages and trends, leading to misleading conclusions. In the sample above, the single 35-year value pulls the mean to about 5.1 years while the median stays at 3. Common approaches to handling outliers include verifying data accuracy, excluding them from certain analyses, or using robust statistical measures like the median instead of the mean.
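
As a rough sketch of the "exclude or use robust measures" options, the example below flags outliers with the interquartile range (IQR) rule and compares the mean and median before and after exclusion. The IQR rule and its conventional 1.5 multiplier are a swapped-in alternative to the Z-score approach, not the lesson's own method. The data is the same sample tenure series used earlier.

```python
import pandas as pd

# Same sample employee tenure data (in years) as in the earlier example
tenure_data = pd.Series([1, 2, 3, 2, 5, 4, 3, 35, 2, 3, 4, 3, 2, 4, 3])

# Interquartile range: spread of the middle 50% of the data
q1 = tenure_data.quantile(0.25)
q3 = tenure_data.quantile(0.75)
iqr = q3 - q1

# Conventional fences: values beyond 1.5 * IQR from the quartiles are flagged
lower_bound = q1 - 1.5 * iqr
upper_bound = q3 + 1.5 * iqr
is_outlier = (tenure_data < lower_bound) | (tenure_data > upper_bound)

iqr_outliers = tenure_data[is_outlier]
cleaned = tenure_data[~is_outlier]

print("Flagged values:", iqr_outliers.tolist())
print("Mean with / without outliers:", round(tenure_data.mean(), 2), "/", round(cleaned.mean(), 2))
print("Median with / without outliers:", tenure_data.median(), "/", cleaned.median())
```

The mean shifts noticeably once the flagged value is excluded, while the median barely moves, which is why the median is often the safer summary when outliers cannot be verified. Whether to drop a flagged record is still a judgment call: check it against the source system before excluding it.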

```python
import matplotlib.pyplot as plt

# Plot tenure distribution
plt.figure(figsize=(8, 4))
plt.hist(tenure_data, bins=range(1, 40, 2), color='skyblue', edgecolor='black', alpha=0.7)
plt.xlabel('Tenure (years)')
plt.ylabel('Number of Employees')
plt.title('Employee Tenure Distribution')

# Highlight outliers
for outlier in outliers:
    plt.axvline(outlier, color='red', linestyle='dashed', linewidth=2, label='Outlier' if outlier == outliers.iloc[0] else "")

plt.legend()
plt.show()
```

1. What is an outlier in the context of HR data?

2. Fill in the blank: The ____ function in scipy can help identify statistical outliers.

3. Why is it important to identify trends in workforce data?

Section 2. Chapter 6