Calibrating Rule-Based Models | Hybrid and Applied Rule-Based Forecasting

Calibrating Rule-Based Models

When you use rule-based models for machine learning, you often want not just crisp decisions, but reliable probability estimates—such as the chance that a transaction is fraudulent or a customer will churn. However, rule-based models, especially those designed for interpretability, can produce probability outputs that are poorly aligned with true outcome frequencies. This misalignment is called miscalibration. Calibrating a rule-based model means adjusting its probability outputs so that, for instance, predictions of 70% probability actually correspond to about 70% positive outcomes in reality. Without calibration, you risk making decisions based on misleading confidence scores, which can have serious consequences in applications like medical diagnosis or financial risk assessment.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.calibration import CalibratedClassifierCV, calibration_curve
from sklearn.frozen import FrozenEstimator
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

# Create a small synthetic dataset
X, y = make_classification(n_samples=300, n_features=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Train a simple rule-based classifier (Decision Tree)
dt = DecisionTreeClassifier(max_depth=3, random_state=42)
dt.fit(X_train, y_train)

# Calibrate the classifier using the FrozenEstimator API (scikit-learn >= 1.6):
# the already-fitted tree is kept fixed and only the calibrator is fitted
calibrated_dt = CalibratedClassifierCV(
    estimator=FrozenEstimator(dt),
    method="sigmoid"
)
calibrated_dt.fit(X_train, y_train)

# Get probabilities
probs_uncalibrated = dt.predict_proba(X_test)[:, 1]
probs_calibrated = calibrated_dt.predict_proba(X_test)[:, 1]

# Plot calibration curves
fraction_of_positives_uncal, mean_predicted_value_uncal = calibration_curve(
    y_test, probs_uncalibrated, n_bins=10)
fraction_of_positives_cal, mean_predicted_value_cal = calibration_curve(
    y_test, probs_calibrated, n_bins=10)

plt.figure(figsize=(7, 5))
plt.plot(mean_predicted_value_uncal, fraction_of_positives_uncal, "s-",
         label="Uncalibrated")
plt.plot(mean_predicted_value_cal, fraction_of_positives_cal, "o-",
         label="Calibrated")
plt.plot([0, 1], [0, 1], "k:", label="Perfectly calibrated")
plt.xlabel("Mean predicted probability")
plt.ylabel("Fraction of positives")
plt.legend()
plt.title("Calibration Curves: Rule-Based Classifier")
plt.show()

Calibration improves the reliability of your rule-based model’s probability estimates, which in turn makes your decisions more trustworthy. In the code above, you first train a decision tree classifier, which acts as a rule-based model. Then, you calibrate its probability outputs using CalibratedClassifierCV with Platt scaling. The resulting calibration curves show how well the predicted probabilities align with actual outcomes. When a model is well-calibrated, its probability estimates can be used directly for setting decision thresholds, prioritizing cases, or communicating risk to stakeholders. This process increases trust in the model’s outputs, especially in scenarios where acting on overconfident or underconfident predictions could be costly or dangerous.
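Beyond the visual check of a calibration curve, you can quantify the improvement with a proper scoring rule such as the Brier score, which is the mean squared error between predicted probabilities and actual outcomes (lower is better). The short sketch below reuses dt, calibrated_dt, probs_uncalibrated, probs_calibrated, and y_test from the snippet above; the 0.7 threshold for flagging high-risk cases is purely an illustrative choice, not part of the original example.

from sklearn.metrics import brier_score_loss

# Brier score: mean squared error between predicted probabilities
# and actual 0/1 outcomes (lower is better)
brier_uncal = brier_score_loss(y_test, probs_uncalibrated)
brier_cal = brier_score_loss(y_test, probs_calibrated)
print(f"Brier score (uncalibrated): {brier_uncal:.3f}")
print(f"Brier score (calibrated):   {brier_cal:.3f}")

# With well-calibrated probabilities, a decision threshold can be read
# as a real risk level, e.g. flag cases with more than a 70% chance
high_risk = probs_calibrated > 0.7   # illustrative threshold, an assumption
print(f"Cases flagged as high risk: {high_risk.sum()} of {len(y_test)}")

A lower Brier score for the calibrated model indicates that its probability estimates track the observed outcome frequencies more closely, which is exactly what the calibration curve shows graphically.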


What is the main purpose of calibrating a rule-based model's probability outputs?

