Evaluating Classification Models
Predictive Modeling with Tidymodels in R


Evaluating classification models requires understanding several key metrics that reveal different aspects of model performance. The most common metrics include accuracy, which measures the proportion of correct predictions out of all predictions; precision, which is the ratio of true positives to the sum of true positives and false positives; recall, which is the ratio of true positives to the sum of true positives and false negatives; and ROC AUC, which summarizes the trade-off between sensitivity and specificity across different thresholds. These metrics help you judge how well your classification model distinguishes between classes and how it might perform in real-world scenarios.
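Each of these metrics is available as a yardstick function in tidymodels. The following is a minimal sketch on a small hand-built set of predictions (the class labels and counts are made up for illustration); each metric function returns a one-row tibble with the result in the `.estimate` column.

```r
library(tidymodels)

# Hand-built truth/estimate pairs: 6 cases, 4 predicted correctly
preds <- tibble(
  truth    = factor(c("pos", "pos", "pos", "neg", "neg", "neg"),
                    levels = c("pos", "neg")),
  estimate = factor(c("pos", "pos", "neg", "neg", "neg", "pos"),
                    levels = c("pos", "neg"))
)

accuracy(preds, truth = truth, estimate = estimate)   # 4/6 correct
precision(preds, truth = truth, estimate = estimate)  # TP / (TP + FP) = 2/3
recall(preds, truth = truth, estimate = estimate)     # TP / (TP + FN) = 2/3
```

By default yardstick treats the first factor level (`"pos"` here) as the event of interest, which is why the factor levels are set explicitly.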

options(crayon.enabled = FALSE)

library(tidymodels)

# 1. Simulate the prediction results (what collect_predictions() usually returns)
set.seed(42)
predictions_df <- tibble(
  # True values (classes)
  target = factor(sample(c("Class1", "Class2"), 100, replace = TRUE)),
  # Probability for Class1 (random numbers between 0 and 1)
  .pred_Class1 = runif(100)
) %>%
  mutate(
    # Probability for Class2
    .pred_Class2 = 1 - .pred_Class1,
    # Final predicted class (Class1 if prob > 0.5, else Class2)
    .pred_class = factor(ifelse(.pred_Class1 > 0.5, "Class1", "Class2"),
                         levels = c("Class1", "Class2"))
  )

# 2. Generate the confusion matrix
conf_mat_res <- predictions_df %>%
  conf_mat(truth = target, estimate = .pred_class)

print("--- Confusion Matrix ---")
print(conf_mat_res)

# 3. Generate the ROC curve and plot it
roc_res <- predictions_df %>%
  roc_curve(truth = target, .pred_Class1) %>%
  autoplot()

print(roc_res)

# 4. Calculate ROC AUC
roc_auc_res <- predictions_df %>%
  roc_auc(truth = target, .pred_Class1)

print("--- ROC AUC ---")
print(roc_auc_res)

When you interpret classification metrics, consider the context and the consequences of different types of errors. High accuracy may not always mean good performance, especially with imbalanced classes. Precision is crucial when the cost of a false positive is high, while recall is more important when missing a positive instance is costly. The ROC curve helps visualize the model's ability to discriminate between classes at various thresholds, and the AUC provides a single-value summary of this discrimination. Comparing models using these metrics allows you to select the most suitable one for your specific application, balancing trade-offs between precision, recall, and overall accuracy.
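When comparing models, it helps to compute all the relevant metrics in one call rather than one at a time. A sketch of this, using yardstick's `metric_set()` on simulated predictions shaped like the `predictions_df` above (the columns and seed are illustrative only):

```r
library(tidymodels)

# Simulated predictions mirroring the predictions_df used earlier
set.seed(123)
preds <- tibble(
  target = factor(sample(c("Class1", "Class2"), 100, replace = TRUE)),
  .pred_Class1 = runif(100)
) %>%
  mutate(
    .pred_class = factor(ifelse(.pred_Class1 > 0.5, "Class1", "Class2"),
                         levels = c("Class1", "Class2"))
  )

# Bundle the class metrics so every model is scored on the same footing
class_metrics <- metric_set(accuracy, precision, recall)
class_metrics(preds, truth = target, estimate = .pred_class)

# roc_auc takes the probability column instead of the hard class prediction
roc_auc(preds, truth = target, .pred_Class1)
```

Running the same metric set on each candidate model's predictions produces directly comparable tibbles, which makes the precision/recall trade-off easy to inspect side by side.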


A binary classifier predicts 80 positive and 20 negative cases. The true labels are 90 positive and 10 negative. The confusion matrix shows:

  • 75 true positives
  • 5 false positives
  • 15 false negatives
  • 5 true negatives

What is the accuracy of this model?
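For questions like this, the arithmetic follows directly from the definition, accuracy = (TP + TN) / total, as a quick sketch shows:

```r
# Confusion-matrix counts from the question above
tp <- 75  # true positives
fp <- 5   # false positives
fn <- 15  # false negatives
tn <- 5   # true negatives

# Accuracy: correct predictions over all predictions
acc <- (tp + tn) / (tp + fp + fn + tn)
acc  # (75 + 5) / 100 = 0.8
```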
