Metrics and Contamination Trade-Offs

Outlier and Novelty Detection in Practice | Evaluation and Practical Comparison

Understanding how to evaluate outlier and novelty detection models is crucial for practical deployment. Two of the most widely used metrics are ROC-AUC (Receiver Operating Characteristic - Area Under Curve) and Precision-Recall curves. Each provides different insights into model performance, and their interpretation can shift depending on the level of contamination—meaning the proportion of true anomalies present in your dataset.

ROC-AUC measures the model's ability to distinguish between classes across all possible thresholds. The ROC curve plots the true positive rate (TPR) against the false positive rate (FPR), and the area under this curve summarizes the performance: a value of 1.0 indicates perfect separation, while 0.5 suggests random guessing.
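
ROC-AUC can be computed directly from anomaly scores, without committing to a threshold. The following minimal sketch uses made-up labels and scores purely for illustration (1 marks an anomaly, and higher scores mean "more anomalous"):

from sklearn.metrics import roc_auc_score, roc_curve

# Hypothetical ground-truth labels (1 = anomaly) and anomaly scores
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.1, 0.4, 0.2, 0.5, 0.9, 0.3, 0.8, 0.35]

fpr, tpr, thresholds = roc_curve(y_true, scores)  # points along the ROC curve
auc_value = roc_auc_score(y_true, scores)         # area under that curve

print(f"ROC-AUC: {auc_value:.2f}")  # about 0.87: most, but not all, anomalies outrank the inliers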

Precision-Recall curves focus on the trade-off between precision (the proportion of detected anomalies that are actually true anomalies) and recall (the proportion of true anomalies that are detected). This metric is especially informative when dealing with highly imbalanced datasets, which is common in outlier detection, since anomalies are typically rare.
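
The same score vector yields the Precision-Recall curve, and average precision is a common single-number summary of it. A small sketch, reusing the hypothetical labels and scores from above:

from sklearn.metrics import precision_recall_curve, average_precision_score

y_true = [0, 0, 0, 0, 1, 0, 1, 1]                   # 1 = anomaly (the rare class)
scores = [0.1, 0.4, 0.2, 0.5, 0.9, 0.3, 0.8, 0.35]  # higher = more anomalous

precision, recall, thresholds = precision_recall_curve(y_true, scores)
ap = average_precision_score(y_true, scores)

print(f"Average precision: {ap:.2f}")
# Unlike ROC-AUC, the no-skill baseline here is the anomaly rate itself
# (3/8 = 0.375), so the score must be read relative to class balance.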

Contamination, or the proportion of outliers in the data, can greatly influence the evaluation. High contamination can inflate performance metrics, making a model appear more effective than it is on truly rare anomalies. Conversely, very low contamination can make it difficult for any model to achieve high recall without sacrificing precision. Selecting the right evaluation metric and understanding the impact of contamination is essential for drawing meaningful conclusions about model performance.
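
One way to see this effect is to vary the fraction of injected outliers in synthetic data and watch how the metrics move for the same detector. A rough sketch, assuming a setup similar to the full example later in this chapter:

import numpy as np
from sklearn.datasets import make_blobs
from sklearn.ensemble import IsolationForest
from sklearn.metrics import roc_auc_score, average_precision_score

rng = np.random.RandomState(0)
X_inliers, _ = make_blobs(n_samples=300, centers=1, cluster_std=0.60, random_state=42)

for n_outliers in (3, 30, 100):  # roughly 1%, 9%, and 25% contamination
    X_outliers = rng.uniform(low=-6, high=6, size=(n_outliers, 2))
    X = np.vstack([X_inliers, X_outliers])
    y = np.hstack([np.zeros(len(X_inliers)), np.ones(n_outliers)])

    model = IsolationForest(random_state=42).fit(X)
    scores = -model.decision_function(X)  # higher = more anomalous

    print(f"outliers={n_outliers:3d}  "
          f"ROC-AUC={roc_auc_score(y, scores):.2f}  "
          f"AP={average_precision_score(y, scores):.2f}")

# Typically ROC-AUC stays fairly stable while average precision rises as
# anomalies become more common, even though the detector itself is unchanged.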

Note

The choice between ROC-AUC and Precision-Recall depends on your application's priorities and the rarity of anomalies. In highly imbalanced datasets, Precision-Recall curves often provide a more realistic picture of model effectiveness. Always consider how contamination in your data may bias these metrics, and use domain knowledge to set contamination levels that reflect your real-world scenario.
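
For instance, if domain experts estimate that roughly 2% of records are anomalous, that estimate can be passed straight to the contamination parameter. A minimal sketch with a made-up rate:

from sklearn.ensemble import IsolationForest

# Hypothetical domain estimate: about 2% of records are anomalous
expected_anomaly_rate = 0.02

model = IsolationForest(contamination=expected_anomaly_rate, random_state=42)
# After fitting, model.predict(...) labels roughly this fraction of training
# points as -1 (outliers); the raw score_samples(...) values are unaffected,
# only the decision threshold moves.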

import numpy as np
import matplotlib.pyplot as plt
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM
from sklearn.metrics import roc_curve, auc, precision_recall_curve, average_precision_score
from sklearn.datasets import make_blobs

# Generate synthetic data: one dense inlier cluster plus uniform outliers
X, y_true = make_blobs(n_samples=300, centers=1, cluster_std=0.60, random_state=42)
rng = np.random.RandomState(42)
n_outliers = 30
X_outliers = rng.uniform(low=-6, high=6, size=(n_outliers, 2))
X_full = np.vstack([X, X_outliers])
y_full = np.hstack([np.zeros(len(X)), np.ones(n_outliers)])  # 1 = outlier

# Detectors to compare (LOF needs novelty=True to expose decision_function)
models = {
    "Isolation Forest": IsolationForest(contamination=0.1, random_state=42),
    "Local Outlier Factor": LocalOutlierFactor(n_neighbors=20, contamination=0.1, novelty=True),
    "One-Class SVM": OneClassSVM(nu=0.1, kernel="rbf", gamma=0.1)
}

plt.figure(figsize=(14, 6))

for i, (name, model) in enumerate(models.items()):
    # Fit and score: decision_function is high for inliers, so negate it
    # to obtain scores where higher means "more anomalous"
    model.fit(X_full)
    scores = -model.decision_function(X_full)

    # ROC curve and its area
    fpr, tpr, _ = roc_curve(y_full, scores)
    roc_auc = auc(fpr, tpr)

    # Precision-Recall curve and average precision
    precision, recall, _ = precision_recall_curve(y_full, scores)
    avg_precision = average_precision_score(y_full, scores)

    # Top row: ROC curves
    plt.subplot(2, 3, i + 1)
    plt.plot(fpr, tpr, label=f"AUC = {roc_auc:.2f}")
    plt.plot([0, 1], [0, 1], 'k--', lw=1)
    plt.title(f"ROC: {name}")
    plt.xlabel("False Positive Rate")
    plt.ylabel("True Positive Rate")
    plt.legend()

    # Bottom row: Precision-Recall curves
    plt.subplot(2, 3, i + 4)
    plt.plot(recall, precision, label=f"AP = {avg_precision:.2f}")
    plt.title(f"Precision-Recall: {name}")
    plt.xlabel("Recall")
    plt.ylabel("Precision")
    plt.legend()

plt.tight_layout()
plt.show()

Which statement best describes the impact of contamination on evaluation metrics in outlier detection?


