Notice: This page requires JavaScript to function properly.
Please enable JavaScript in your browser settings or update your browser.
Median Absolute Deviation | Statistical Methods in Anomaly Detection
course content

Зміст курсу

Data Anomaly Detection

Median Absolute DeviationMedian Absolute Deviation

The MAD (Median Absolute Deviation) rule is a statistical outlier detection method that uses the median and the median absolute deviation as robust estimators to identify outliers in a dataset.

It is particularly useful when dealing with data that may not follow a normal distribution or when there are potential outliers that can significantly impact the mean and standard deviation.

How to use MAD rule

  1. Calculate the Median: Compute the median of the dataset, which is the middle value when the data is sorted;
  2. Calculate the Median Absolute Deviation (MAD): For each data point, find the absolute difference between the data point and the median. The MAD is the median of these absolute differences;
  3. Define a Threshold: Choose a threshold value (usually a constant, e.g., 2 or 3 times the MAD) to determine how far a data point can deviate from the median before being considered an outlier;
  4. Identify Outliers: Any data point that has an absolute difference from the median greater than the threshold is considered an outlier.

    Note

    Mathematically, the absolute difference between two values, A and B, is denoted as |A - B|, where "|" represents the absolute value function. This function returns the positive value of the difference between A and B.

MAD rule implementation

MAD vs 1.5 IQR rule

Rule Description Pros Cons
MAD Rule The MAD (Median Absolute Deviation) rule uses the median and the median absolute deviation to identify outliers.
  • Robust to outliers and resistant to extreme values.
  • Applicable to non-normal and skewed data distributions.
  • Requires choosing a threshold, which can be somewhat arbitrary.
  • May not perform well with small datasets.
1.5 IQR Rule The 1.5 IQR (Interquartile Range) rule uses the IQR to identify outliers based on quartiles.
  • Simple to understand and apply.
  • Provides a clear definition of outliers based on quartiles.
  • Sensitive to extreme values and outliers.
  • May not work well for non-normal and skewed data distributions.

What is the main advantage of using MAD for outlier detection?

Виберіть правильну відповідь

Все було зрозуміло?

Секція 2. Розділ 5
course content

Зміст курсу

Data Anomaly Detection

Median Absolute DeviationMedian Absolute Deviation

The MAD (Median Absolute Deviation) rule is a statistical outlier detection method that uses the median and the median absolute deviation as robust estimators to identify outliers in a dataset.

It is particularly useful when dealing with data that may not follow a normal distribution or when there are potential outliers that can significantly impact the mean and standard deviation.

How to use MAD rule

  1. Calculate the Median: Compute the median of the dataset, which is the middle value when the data is sorted;
  2. Calculate the Median Absolute Deviation (MAD): For each data point, find the absolute difference between the data point and the median. The MAD is the median of these absolute differences;
  3. Define a Threshold: Choose a threshold value (usually a constant, e.g., 2 or 3 times the MAD) to determine how far a data point can deviate from the median before being considered an outlier;
  4. Identify Outliers: Any data point that has an absolute difference from the median greater than the threshold is considered an outlier.

    Note

    Mathematically, the absolute difference between two values, A and B, is denoted as |A - B|, where "|" represents the absolute value function. This function returns the positive value of the difference between A and B.

MAD rule implementation

MAD vs 1.5 IQR rule

Rule Description Pros Cons
MAD Rule The MAD (Median Absolute Deviation) rule uses the median and the median absolute deviation to identify outliers.
  • Robust to outliers and resistant to extreme values.
  • Applicable to non-normal and skewed data distributions.
  • Requires choosing a threshold, which can be somewhat arbitrary.
  • May not perform well with small datasets.
1.5 IQR Rule The 1.5 IQR (Interquartile Range) rule uses the IQR to identify outliers based on quartiles.
  • Simple to understand and apply.
  • Provides a clear definition of outliers based on quartiles.
  • Sensitive to extreme values and outliers.
  • May not work well for non-normal and skewed data distributions.

What is the main advantage of using MAD for outlier detection?

Виберіть правильну відповідь

Все було зрозуміло?

Секція 2. Розділ 5
some-alt