
Calibrating Rule-Based Models

When you use rule-based models for machine learning, you often want not just crisp decisions, but reliable probability estimates—such as the chance that a transaction is fraudulent or a customer will churn. However, rule-based models, especially those designed for interpretability, can produce probability outputs that are poorly aligned with true outcome frequencies. This misalignment is called miscalibration. Calibrating a rule-based model means adjusting its probability outputs so that, for instance, predictions of 70% probability actually correspond to about 70% positive outcomes in reality. Without calibration, you risk making decisions based on misleading confidence scores, which can have serious consequences in applications like medical diagnosis or financial risk assessment.
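As a rough illustration of this definition, here is a minimal sketch (the y_true and y_prob arrays are made up purely for illustration) that checks the observed positive rate among samples whose predicted probability is close to 0.7; for a well-calibrated model this rate should land near 70%.

import numpy as np

# Hypothetical labels and predicted probabilities, for illustration only
y_true = np.array([1, 0, 1, 1, 0, 1, 1, 0, 1, 1])
y_prob = np.array([0.72, 0.68, 0.71, 0.69, 0.73, 0.70, 0.74, 0.66, 0.71, 0.68])

# Keep only the predictions close to 70% and compare with the observed outcome rate
near_70 = np.abs(y_prob - 0.70) < 0.05
observed_rate = y_true[near_70].mean()
print(f"Observed positive rate among ~70% predictions: {observed_rate:.2f}")

The full example below applies the same idea to a real classifier, using scikit-learn's calibration utilities to visualize how well predicted probabilities match observed outcomes.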

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.calibration import CalibratedClassifierCV, calibration_curve
from sklearn.frozen import FrozenEstimator
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

# Create a small synthetic dataset
X, y = make_classification(n_samples=300, n_features=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Train a simple rule-based classifier (Decision Tree)
dt = DecisionTreeClassifier(max_depth=3, random_state=42)
dt.fit(X_train, y_train)

# Calibrate the already-fitted classifier using the FrozenEstimator API
calibrated_dt = CalibratedClassifierCV(
    estimator=FrozenEstimator(dt),
    method="sigmoid"
)
calibrated_dt.fit(X_train, y_train)

# Get probabilities for the positive class
probs_uncalibrated = dt.predict_proba(X_test)[:, 1]
probs_calibrated = calibrated_dt.predict_proba(X_test)[:, 1]

# Compute calibration curves for both sets of probabilities
fraction_of_positives_uncal, mean_predicted_value_uncal = calibration_curve(
    y_test, probs_uncalibrated, n_bins=10)
fraction_of_positives_cal, mean_predicted_value_cal = calibration_curve(
    y_test, probs_calibrated, n_bins=10)

# Plot calibration curves against the perfectly calibrated diagonal
plt.figure(figsize=(7, 5))
plt.plot(mean_predicted_value_uncal, fraction_of_positives_uncal, "s-", label="Uncalibrated")
plt.plot(mean_predicted_value_cal, fraction_of_positives_cal, "o-", label="Calibrated")
plt.plot([0, 1], [0, 1], "k:", label="Perfectly calibrated")
plt.xlabel("Mean predicted probability")
plt.ylabel("Fraction of positives")
plt.legend()
plt.title("Calibration Curves: Rule-Based Classifier")
plt.show()

Calibration improves the reliability of your rule-based model’s probability estimates, which in turn makes your decisions more trustworthy. In the code above, you first train a decision tree classifier, which acts as a rule-based model. Then you calibrate its probability outputs with CalibratedClassifierCV using Platt scaling (the method="sigmoid" option); wrapping the tree in FrozenEstimator tells CalibratedClassifierCV not to refit it, so only the calibration mapping is learned. Note that fitting this mapping on the same data the tree was trained on, as done here for simplicity, can give over-optimistic results; in practice a separate held-out calibration set is preferable. The resulting calibration curves show how well the predicted probabilities align with actual outcomes. When a model is well-calibrated, its probability estimates can be used directly for setting decision thresholds, prioritizing cases, or communicating risk to stakeholders. This process increases trust in the model’s outputs, especially in scenarios where acting on overconfident or underconfident predictions could be costly or dangerous.
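You can also quantify the improvement numerically rather than relying only on the plot. One common summary is the Brier score (the mean squared difference between predicted probabilities and actual outcomes), available in scikit-learn as brier_score_loss. The snippet below assumes y_test, probs_uncalibrated, and probs_calibrated from the example above are still in scope; a lower score indicates probabilities that track reality more closely.

from sklearn.metrics import brier_score_loss

# Lower Brier score means predicted probabilities are closer to actual outcomes
brier_uncal = brier_score_loss(y_test, probs_uncalibrated)
brier_cal = brier_score_loss(y_test, probs_calibrated)
print(f"Brier score (uncalibrated): {brier_uncal:.3f}")
print(f"Brier score (calibrated):   {brier_cal:.3f}")

If you have enough data, you can also pass method="isotonic" to CalibratedClassifierCV as a non-parametric alternative to Platt scaling; it is more flexible but needs more calibration samples to avoid overfitting.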


