Identifying Trends and Outliers | Data Analysis for Operations Decisions


Understanding how to identify trends and outliers in operational data is crucial for making informed decisions as an operations manager. Trends represent the general direction in which a metric, such as sales or inventory levels, is moving over time. Recognizing upward or downward trends allows you to anticipate changes, allocate resources more effectively, and adjust strategies to capitalize on positive patterns or mitigate negative ones. Outliers, on the other hand, are data points that deviate significantly from the rest of the dataset. These anomalies can signal issues such as errors in data entry, unexpected demand spikes, supply chain disruptions, or process breakdowns. Ignoring outliers may lead to incorrect conclusions, while addressing them can help uncover hidden problems or opportunities.


import pandas as pd

# Example sales data for two weeks
data = {
    "date": pd.date_range(start="2024-06-01", periods=14, freq="D"),
    "sales": [120, 135, 150, 160, 155, 140, 130, 155, 170, 180, 175, 160, 150, 165]
}
df = pd.DataFrame(data)

# Calculate a 3-day moving average to smooth out daily fluctuations and reveal trends.
# The first two rows are NaN because a full 3-day window is not yet available.
df["sales_ma_3"] = df["sales"].rolling(window=3).mean()

print(df[["date", "sales", "sales_ma_3"]])
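
A moving average shows the trend visually, but it can also help to put a number on the direction. The sketch below is one possible way to do that, not part of the original lesson: it fits a straight line to the same sales figures with `numpy.polyfit` and reads the slope as the average change per day. The variable names (`day_index`, `slope`, `direction`) are illustrative assumptions, and a linear fit is only a rough summary when the data is not roughly linear.

```python
import numpy as np
import pandas as pd

# Same two-week sales figures as in the moving-average example
sales = pd.Series([120, 135, 150, 160, 155, 140, 130, 155, 170, 180, 175, 160, 150, 165])

# Fit a straight line (degree-1 polynomial) to sales against the day index
day_index = np.arange(len(sales))
slope, intercept = np.polyfit(day_index, sales.to_numpy(), 1)

# A positive slope suggests an upward trend, a negative slope a downward one
direction = "upward" if slope > 0 else "downward" if slope < 0 else "flat"
print(f"Estimated change per day: {slope:.2f} units ({direction})")
```

For this sample the slope is positive, which agrees with the gradual rise visible in the moving average column.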

To detect outliers in operational data, such as inventory levels or daily sales, you can use several methods. One common approach is to calculate the mean and standard deviation of the dataset and treat data points that fall more than a chosen number of standard deviations from the mean as outliers. For example, if a day's sales volume is above the mean plus two standard deviations, or below the mean minus two standard deviations, it may indicate an unusual event or error that requires further investigation. Other techniques include the interquartile range (IQR) method and boxplots, which make anomalies easy to spot visually.


import pandas as pd

# Simulated daily order volume for two weeks
data = {
    "date": pd.date_range(start="2024-06-01", periods=14, freq="D"),
    "orders": [40, 42, 39, 41, 150, 38, 40, 43, 39, 41, 42, 37, 200, 39]
}
df = pd.DataFrame(data)

# Calculate mean and standard deviation
mean_orders = df["orders"].mean()
std_orders = df["orders"].std()

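# Identify outlier days (orders more than 2 standard deviations from the mean).
# Note: extreme values inflate the mean and standard deviation themselves, so with
# this data the upper threshold is about 159 and only the 200-order day is flagged;
# the 150-order day is not.
df["is_outlier"] = ((df["orders"] > mean_orders + 2 * std_orders) |
                    (df["orders"] < mean_orders - 2 * std_orders))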

print(df[["date", "orders", "is_outlier"]])
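
The interquartile range (IQR) approach mentioned above can be sketched in a similar way. Because quartiles are less affected by extreme values than the mean and standard deviation, this rule can catch spikes that a standard-deviation threshold misses. The example below is a minimal sketch on the same order volumes; the 1.5 multiplier is a widely used convention rather than something specified in this lesson, and the names `lower_bound`, `upper_bound`, and `outliers` are illustrative.

```python
import pandas as pd

# Same simulated daily order volumes as in the standard-deviation example
orders = pd.Series([40, 42, 39, 41, 150, 38, 40, 43, 39, 41, 42, 37, 200, 39])

# The IQR is the spread of the middle 50% of the data
q1 = orders.quantile(0.25)
q3 = orders.quantile(0.75)
iqr = q3 - q1

# Assumed convention: values more than 1.5 * IQR beyond the quartiles are outliers
lower_bound = q1 - 1.5 * iqr
upper_bound = q3 + 1.5 * iqr

is_outlier = (orders < lower_bound) | (orders > upper_bound)
outliers = orders[is_outlier].tolist()
print(outliers)
```

On this sample the rule flags both the 150-order and the 200-order days, while the ordinary days between 37 and 43 orders stay inside the bounds.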

1. What is a moving average and how does it help identify trends?

2. How can outliers affect operational decisions?

3. Which pandas method can be used to calculate the mean of a column?


Section 2. Chapter 4
