Predictive Modeling with Tidymodels in R

Evaluating Classification Models


Evaluating classification models requires understanding several key metrics, each revealing a different aspect of model performance:

  • Accuracy: the proportion of correct predictions out of all predictions.
  • Precision: the ratio of true positives to the sum of true positives and false positives.
  • Recall: the ratio of true positives to the sum of true positives and false negatives.
  • ROC AUC: a summary of the trade-off between sensitivity and specificity across different thresholds.

These metrics help you judge how well your classification model distinguishes between classes and how it might perform in real-world scenarios.
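The definitions above translate directly into arithmetic on confusion-matrix counts. A minimal sketch in base R, using hypothetical counts chosen purely for illustration:

```r
# Hypothetical confusion-matrix counts (illustrative values, not from the lesson)
tp <- 40  # true positives
fp <- 10  # false positives
fn <- 20  # false negatives
tn <- 30  # true negatives

accuracy  <- (tp + tn) / (tp + fp + fn + tn)  # correct predictions / all predictions
precision <- tp / (tp + fp)                   # of predicted positives, how many are right
recall    <- tp / (tp + fn)                   # of actual positives, how many were found

c(accuracy = accuracy, precision = precision, recall = recall)
```

Note how precision and recall divide the same numerator by different totals: precision penalizes false positives, recall penalizes false negatives.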

options(crayon.enabled = FALSE)

library(tidymodels)

# 1. Simulate the prediction results (what collect_predictions() usually returns)
set.seed(42)
predictions_df <- tibble(
  # True values (classes)
  target = factor(sample(c("Class1", "Class2"), 100, replace = TRUE)),
  # Probability for Class1 (random numbers between 0 and 1)
  .pred_Class1 = runif(100)
) %>%
  mutate(
    # Probability for Class2
    .pred_Class2 = 1 - .pred_Class1,
    # Final predicted class (Class1 if prob > 0.5, else Class2)
    .pred_class = factor(ifelse(.pred_Class1 > 0.5, "Class1", "Class2"),
                         levels = c("Class1", "Class2"))
  )

# 2. Generate the confusion matrix
conf_mat_res <- predictions_df %>%
  conf_mat(truth = target, estimate = .pred_class)

print("--- Confusion Matrix ---")
print(conf_mat_res)

# 3. Generate the ROC curve and plot it
roc_res <- predictions_df %>%
  roc_curve(truth = target, .pred_Class1) %>%
  autoplot()
print(roc_res)

# 4. Calculate the ROC AUC
roc_auc_res <- predictions_df %>%
  roc_auc(truth = target, .pred_Class1)

print("--- ROC AUC ---")
print(roc_auc_res)

When you interpret classification metrics, consider the context and the consequences of different types of errors. High accuracy may not always mean good performance, especially with imbalanced classes. Precision is crucial when the cost of a false positive is high, while recall is more important when missing a positive instance is costly. The ROC curve helps visualize the model's ability to discriminate between classes at various thresholds, and the AUC provides a single-value summary of this discrimination. Comparing models using these metrics allows you to select the most suitable one for your specific application, balancing trade-offs between precision, recall, and overall accuracy.
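When comparing models on several metrics at once, yardstick's `metric_set()` lets you bundle accuracy, precision, and recall into a single callable. A minimal sketch, assuming a simulated `predictions_df` like the one in the example above:

```r
library(tidymodels)

# Simulate predictions the same way as in the example above
set.seed(42)
predictions_df <- tibble(
  target = factor(sample(c("Class1", "Class2"), 100, replace = TRUE)),
  .pred_Class1 = runif(100)
) %>%
  mutate(
    .pred_class = factor(ifelse(.pred_Class1 > 0.5, "Class1", "Class2"),
                         levels = c("Class1", "Class2"))
  )

# Bundle several class metrics into one function and compute them together
cls_metrics <- metric_set(accuracy, precision, recall)
cls_metrics(predictions_df, truth = target, estimate = .pred_class)
```

Running the same metric set against each candidate model's predictions makes the precision/recall trade-off easy to compare side by side.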

Question

A binary classifier predicts 80 positive and 20 negative cases. The true labels are 90 positive and 10 negative. The confusion matrix shows:

  • 75 true positives
  • 5 false positives
  • 15 false negatives
  • 5 true negatives

What is the accuracy of this model?
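You can verify your answer by plugging the confusion-matrix counts into the accuracy formula; a small helper function (hypothetical, written here for checking only) makes this a one-liner:

```r
# Accuracy from confusion-matrix counts: (TP + TN) / total
conf_accuracy <- function(tp, fp, fn, tn) (tp + tn) / (tp + fp + fn + tn)

# Plug in the counts from the question to check your chosen answer
conf_accuracy(tp = 75, fp = 5, fn = 15, tn = 5)
```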

Select the correct answer


Section 1. Chapter 7
