
Accuracy, Precision, and Recall

To evaluate classification models, you need clear definitions of accuracy, precision, and recall. These metrics are based on the confusion matrix, which summarizes the counts of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). The mathematical formulas for these metrics are as follows:

  • Accuracy measures the overall proportion of correct predictions:
    \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
  • Precision (also called positive predictive value) measures the proportion of positive predictions that are actually correct:
    \text{Precision} = \frac{TP}{TP + FP}
  • Recall (also called sensitivity or true positive rate) measures the proportion of actual positives that were correctly identified:
    \text{Recall} = \frac{TP}{TP + FN}

Each metric emphasizes a different aspect of model performance, so choosing the right metric depends on your specific goals and the problem context.
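
For example, with the hypothetical confusion matrix counts used in the code sample below (TP = 70, TN = 50, FP = 10, FN = 20), the formulas give:

\text{Accuracy} = \frac{70 + 50}{70 + 50 + 10 + 20} = \frac{120}{150} = 0.80

\text{Precision} = \frac{70}{70 + 10} = \frac{70}{80} = 0.875

\text{Recall} = \frac{70}{70 + 20} = \frac{70}{90} \approx 0.78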

In practice, you should prioritize accuracy when the classes are balanced and the costs of false positives and false negatives are similar. For example, in image classification where all categories are equally important, accuracy provides a straightforward summary of model performance.

Precision is crucial when the cost of a false positive is high. For instance, in email spam detection, you want to minimize the chance of marking a legitimate message as spam (a false positive), so high precision is preferred.

Recall becomes important when missing a positive case is costly. In medical diagnostics, such as cancer screening, it is better to catch as many actual cases as possible, even if some negatives are incorrectly flagged. Here, maximizing recall ensures fewer actual positives are missed.
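
To make this concrete, here is a minimal sketch of why accuracy alone can mislead when positives are rare. The labels and predictions below are hypothetical, chosen to mimic a screening scenario with 5 positives out of 100 cases:

# Hypothetical screening scenario: 5 actual positives out of 100 cases.
y_true = [1] * 5 + [0] * 95  # ground-truth labels: 5 positives, 95 negatives
y_pred = [1] * 1 + [0] * 99  # model flags only 1 of the 5 actual positives

# Count confusion matrix cells by comparing labels pairwise
TP = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # 1
TN = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))  # 95
FP = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # 0
FN = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # 4

accuracy = (TP + TN) / (TP + TN + FP + FN)  # 0.96: looks excellent
recall = TP / (TP + FN)                     # 0.20: 4 of 5 positives missed

print(f"Accuracy: {accuracy:.2f}")  # Accuracy: 0.96
print(f"Recall: {recall:.2f}")      # Recall: 0.20

The high accuracy comes almost entirely from the abundant negatives; recall reveals that most actual positives are missed.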

Understanding when to focus on each metric helps you align model evaluation with real-world objectives and risks.

# Manually compute accuracy, precision, and recall from confusion matrix values

# Example confusion matrix values
TP = 70  # True Positives
TN = 50  # True Negatives
FP = 10  # False Positives
FN = 20  # False Negatives

# Accuracy calculation
accuracy = (TP + TN) / (TP + TN + FP + FN)

# Precision calculation (guard against division by zero)
precision = TP / (TP + FP) if (TP + FP) > 0 else 0

# Recall calculation (guard against division by zero)
recall = TP / (TP + FN) if (TP + FN) > 0 else 0

print(f"Accuracy: {accuracy:.2f}")
print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
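
As a quick cross-check, the same values can be reproduced with scikit-learn's built-in metric functions. This is a minimal sketch; it assumes scikit-learn is installed, which the chapter does not otherwise require:

# Cross-check with scikit-learn (assumes scikit-learn is installed).
# Build label lists that match the confusion matrix above:
# TP = 70, TN = 50, FP = 10, FN = 20.
from sklearn.metrics import accuracy_score, precision_score, recall_score

y_true = [1] * 70 + [0] * 50 + [0] * 10 + [1] * 20
y_pred = [1] * 70 + [0] * 50 + [1] * 10 + [0] * 20

print(f"Accuracy: {accuracy_score(y_true, y_pred):.2f}")    # 0.80
print(f"Precision: {precision_score(y_true, y_pred):.2f}")  # 0.88
print(f"Recall: {recall_score(y_true, y_pred):.2f}")        # 0.78

Note that precision prints as 0.88 here only because 0.875 is rounded to two decimal places.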

Which metric should you prioritize when building an email spam filter, where it is important to avoid marking legitimate emails as spam?


