The **1.5 IQR (Interquartile Range)** rule is a simple but effective method for identifying outliers in a dataset. It's based on the **spread of data around the median** and is commonly used in anomaly detection.

## How to use 1.5 IQR rule

1. Calculate the **IQR**, which is the range between the **75th percentile (Q3)** and the **25th percentile (Q1)** of the dataset;
2. Define the lower threshold as `Q1 - 1.5 * IQR` and the upper threshold as `Q3 + 1.5 * IQR`;
3. Any data point below the lower threshold or above the upper threshold is considered an outlier.
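
To make the three steps concrete, here is a small worked example on a made-up sample. It is only a sketch: the numbers are invented for illustration, and it uses `statistics.quantiles` from the standard library with `method="inclusive"`, which, to our understanding, interpolates between sorted values the same way NumPy's default percentile method does.

```python
from statistics import quantiles

# Made-up sample: eight ordinary values and one suspiciously large one
data = [2, 4, 5, 5, 6, 7, 8, 9, 30]

# Step 1: quartiles and the IQR
q1, _, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Step 2: lower and upper thresholds
lower_threshold = q1 - 1.5 * iqr
upper_threshold = q3 + 1.5 * iqr

# Step 3: points outside the thresholds are flagged as outliers
outliers = [x for x in data if x < lower_threshold or x > upper_threshold]

print(q1, q3, iqr)                       # 5.0 8.0 3.0
print(lower_threshold, upper_threshold)  # 0.5 12.5
print(outliers)                          # [30]
```

Only the value 30 lies above the upper threshold of 12.5, so it is the single point flagged.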

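Here is an implementation of this rule using NumPy:

```python
import numpy as np

def detect_outliers_iqr(data):
    """Return the values in `data` that fall outside the 1.5 IQR thresholds."""
    data = np.array(data)

    # Quartiles and the interquartile range
    q1 = np.percentile(data, 25)
    q3 = np.percentile(data, 75)
    iqr = q3 - q1

    # Thresholds beyond which a point is treated as an outlier
    lower_threshold = q1 - 1.5 * iqr
    upper_threshold = q3 + 1.5 * iqr

    # Boolean mask keeps points below the lower or above the upper threshold
    outliers = data[(data < lower_threshold) | (data > upper_threshold)]

    return outliers
```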
We simply calculate the threshold values and consider every point below the lower threshold or above the upper threshold to be an outlier.

## 1.5 IQR rule for commonly used distributions
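
As a rough illustration of how the rule behaves on a few textbook distributions, the sketch below computes the theoretical share of points that would be flagged. The choice of distributions (standard normal, exponential with rate 1, uniform on [0, 1]) and the helper `thresholds` are our own assumptions for this example, not part of the rule itself.

```python
import math
from statistics import NormalDist

def thresholds(q1, q3, k=1.5):
    """Lower and upper thresholds of the IQR rule for the given quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Standard normal: quartiles come from the inverse CDF
normal = NormalDist(mu=0.0, sigma=1.0)
n_lo, n_hi = thresholds(normal.inv_cdf(0.25), normal.inv_cdf(0.75))
normal_share = normal.cdf(n_lo) + (1.0 - normal.cdf(n_hi))

# Exponential with rate 1: quantile(p) = -ln(1 - p), P(X > x) = exp(-x)
e_lo, e_hi = thresholds(-math.log(0.75), -math.log(0.25))
# e_lo is negative here, so no value can fall below it
exp_share = math.exp(-e_hi)

# Uniform on [0, 1]: quartiles are 0.25 and 0.75
u_lo, u_hi = thresholds(0.25, 0.75)
uniform_share = max(u_lo, 0.0) + max(1.0 - u_hi, 0.0)

print(f"normal:      thresholds at +/-{n_hi:.2f}, {normal_share:.2%} flagged")  # about 2.70 and 0.70%
print(f"exponential: upper threshold {e_hi:.2f}, {exp_share:.2%} flagged")      # about 3.03 and 4.81%
print(f"uniform:     {uniform_share:.2%} flagged")                              # 0.00%
```

For symmetric, light-tailed data the rule flags very few points, while for a skewed distribution such as the exponential it flags several percent of perfectly valid values, all of them on the long-tail side.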

## Pros and cons of using 1.5 IQR rule

<!DOCTYPE html>
<html>

<head>
    <style>
        table {
            width: 80%;
            margin: 20px auto;
            border-collapse: collapse;
            border: 1px solid #ddd;
        }

        th,
        td {
            padding: 10px;
            text-align: left;
        }

        th {
            background-color: #f2f2f2;
        }

        tr:nth-child(even) {
            background-color: #f2f2f2;
        }

        tr:hover {
            background-color: #ddd;
        }

        th strong {
            font-weight: bold;
        }

        td strong {
            font-weight: bold;
        }
    </style>
</head>

<body>
    <table>
        <tr>
            <th><strong>Pros</strong></th>
            <th><strong>Cons</strong></th>
        </tr>
        <tr>
            <td>Simple and easy-to-understand method for identifying outliers.</td>
            <td>May not work well with <strong>non-symmetric or heavily skewed</strong> data distributions.</td>
        </tr>
        <tr>
            <td><strong>Robust to extreme values</strong> (outliers) in the dataset.</td>
            <td>Requires choosing a <strong>fixed multiplier</strong> (e.g., 1.5) which may not be suitable for all datasets.</td>
        </tr>
        <tr>
            <td>Based on quartiles (Q1 and Q3), which are <strong>less affected</strong> by outliers than the mean and standard deviation.</td>
            <td>Doesn't provide information about the <strong>nature or cause</strong> of outliers.</td>
        </tr>
        <tr>
            <td>Useful for identifying potential outliers that deviate significantly from the majority of the data.</td>
            <td>May classify certain valid data points as outliers if they fall outside the fixed threshold.</td>
        </tr>
        <tr>
            <td>Can be applied to <strong>various types of data</strong>: directly to univariate data, and feature by feature to multivariate datasets.</td>
            <td>Doesn't consider the <strong>underlying data distribution</strong> or model assumptions.</td>
        </tr>
    </table>
</body>

</html>
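
The "fixed multiplier" drawback from the table can be softened by exposing the multiplier as a parameter. The sketch below is one possible way to do that; the function name `iqr_outliers`, the parameter `k`, and the sample data are hypothetical and chosen only for illustration. A multiplier of 3 is often used when only the most extreme points should be flagged.

```python
from statistics import quantiles

def iqr_outliers(data, k=1.5):
    """Return values outside [Q1 - k * IQR, Q3 + k * IQR]."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lower or x > upper]

# Made-up sample with one moderately large and one very large value
data = [2, 4, 5, 5, 6, 7, 8, 9, 15, 30]

print(iqr_outliers(data))         # [15, 30]
print(iqr_outliers(data, k=3.0))  # [30]
```

With the default multiplier both 15 and 30 are flagged; with the stricter multiplier only 30 remains, which shows how sensitive the result is to this choice.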



