Learn Rules for Classification and Regression | Foundations of Rule-Based Machine Learning

When you use rules in machine learning, you can target either classification or regression problems. In classification, rules are designed to predict discrete categories or labels, such as "spam" or "not spam", "high risk" or "low risk". For regression, rules are used to predict a continuous numerical value, such as a house price or the temperature for the next day. Imagine a medical dataset: a rule for classification might state, If age > 60 and blood pressure > 140, then risk = 'high', while a regression rule could be, If age > 60 and blood pressure > 140, then predicted cost = 5000. In both cases, the rule structure is similar, but the outcome type and prediction logic differ.


# Example: Simple rule-based classifier and regressor in Python

import pandas as pd

# Sample data for classification
data_class = pd.DataFrame({
    "age": [22, 45, 61, 38, 70],
    "bp": [120, 135, 150, 142, 160],
    "label": ["low", "low", "high", "low", "high"]
})

# Rule-based classifier: if age > 60 and bp > 140, predict "high", else "low"
def rule_based_classifier(row):
    if row["age"] > 60 and row["bp"] > 140:
        return "high"
    else:
        return "low"

data_class["predicted_label"] = data_class.apply(rule_based_classifier, axis=1)
print("Classification predictions:")
print(data_class[["age", "bp", "predicted_label"]])

# Sample data for regression
data_reg = pd.DataFrame({
    "rooms": [2, 3, 4, 5, 6],
    "sqft": [800, 1200, 1500, 2000, 2500],
    "price": [150, 200, 250, 320, 400]
})

# Rule-based regressor: if rooms >= 5 and sqft > 1800, predict 350, else 200
def rule_based_regressor(row):
    if row["rooms"] >= 5 and row["sqft"] > 1800:
        return 350
    else:
        return 200

data_reg["predicted_price"] = data_reg.apply(rule_based_regressor, axis=1)
print("\nRegression predictions:")
print(data_reg[["rooms", "sqft", "predicted_price"]])
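The sample data also contain the true `label` and `price` columns, which the rules themselves never consult. As a minimal sketch of how the two kinds of prediction could be checked against those columns, the snippet below recomputes both rules on plain Python lists (same values as the sample data) and scores them. The choice of accuracy for the classifier and mean absolute error for the regressor is an assumption made for illustration, not something the lesson prescribes.

```python
# Same sample values as the DataFrames above, kept as plain lists
ages = [22, 45, 61, 38, 70]
bps = [120, 135, 150, 142, 160]
labels = ["low", "low", "high", "low", "high"]

rooms = [2, 3, 4, 5, 6]
sqfts = [800, 1200, 1500, 2000, 2500]
prices = [150, 200, 250, 320, 400]

# Apply the same two rules as in the lesson code
predicted_labels = [
    "high" if age > 60 and bp > 140 else "low"
    for age, bp in zip(ages, bps)
]
predicted_prices = [
    350 if r >= 5 and s > 1800 else 200
    for r, s in zip(rooms, sqfts)
]

# Classification: fraction of rows where the predicted category matches
accuracy = sum(p == t for p, t in zip(predicted_labels, labels)) / len(labels)

# Regression: average absolute gap between predicted and actual value
mae = sum(abs(p - t) for p, t in zip(predicted_prices, prices)) / len(prices)

print(f"Accuracy: {accuracy:.2f}")
print(f"Mean absolute error: {mae:.1f}")
```

A category is either right or wrong, so the classifier is scored by counting matches; a numeric prediction can be off by any amount, so the regressor is scored by how far off it is.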

When you compare the rule structures in classification and regression, you will notice that both use logical conditions to trigger a prediction. The prediction logic differs, however: classification rules assign a category (such as "high" or "low"), while regression rules assign a numeric value (such as 350). In the code above, the classifier checks whether both age and blood pressure exceed their thresholds and then outputs a label; the regressor checks whether the home has at least five rooms and more than 1800 square feet and then outputs a price estimate. So while the condition part of a rule ("if ...") can look similar, the action part ("then ...") is tailored to the problem type: a discrete category for classification, a continuous value for regression.
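One way to make the shared structure explicit is to store the condition and the action separately. The sketch below is only an illustration: the `Rule` class, its field names, and the `predict` method are hypothetical and not part of any library. The thresholds and default values are taken from the two rules discussed above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Mapping


@dataclass
class Rule:
    """Hypothetical container: an "if" condition plus a "then" outcome."""
    condition: Callable[[Mapping[str, Any]], bool]  # the "if ..." part
    outcome: Any                                    # the "then ..." part
    default: Any                                    # returned when the rule does not fire

    def predict(self, record: Mapping[str, Any]) -> Any:
        return self.outcome if self.condition(record) else self.default


# Classification rule: the outcome is a category
risk_rule = Rule(lambda r: r["age"] > 60 and r["bp"] > 140, "high", "low")

# Regression rule: same structure, but the outcome is a number
price_rule = Rule(lambda r: r["rooms"] >= 5 and r["sqft"] > 1800, 350, 200)

patients = [{"age": 61, "bp": 150}, {"age": 38, "bp": 142}]
homes = [{"rooms": 5, "sqft": 2000}, {"rooms": 3, "sqft": 1200}]

risk_predictions = [risk_rule.predict(p) for p in patients]
price_predictions = [price_rule.predict(h) for h in homes]

print(risk_predictions)   # ['high', 'low']
print(price_predictions)  # [350, 200]
```

Only the type of `outcome` and `default` changes between the two rules; the condition and the way it is evaluated are identical.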


