Evaluation Metrics in Machine Learning

Accuracy, Precision, and Recall

To evaluate classification models, you need clear definitions of accuracy, precision, and recall. These metrics are based on the confusion matrix, which summarizes the counts of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). The mathematical formulas for these metrics are as follows:

  • Accuracy measures the overall proportion of correct predictions:
    \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
  • Precision (also called positive predictive value) measures the proportion of positive predictions that are actually correct:
    \text{Precision} = \frac{TP}{TP + FP}
  • Recall (also called sensitivity or true positive rate) measures the proportion of actual positives that were correctly identified:
    \text{Recall} = \frac{TP}{TP + FN}
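
In practice, the four counts are usually derived from a model's predictions rather than tabulated by hand. Below is a minimal sketch, assuming scikit-learn is installed and using made-up label arrays purely for illustration, of how the confusion matrix counts can be extracted from predictions:

from sklearn.metrics import confusion_matrix

# Hypothetical example labels: 1 = positive class, 0 = negative class (made up for illustration)
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]  # ground-truth labels
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]  # model predictions

# For binary labels, ravel() returns the counts in the order TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")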

Each metric emphasizes a different aspect of model performance, so choosing the right metric depends on your specific goals and the problem context.

In practice, you should prioritize accuracy when the classes are balanced and the costs of false positives and false negatives are similar. For example, in image classification where all categories are equally important, accuracy provides a straightforward summary of model performance.

Precision is crucial when the cost of a false positive is high. For instance, in email spam detection, you want to minimize the chance of marking a legitimate message as spam (a false positive), so high precision is preferred.

Recall becomes important when missing a positive case is costly. In medical diagnostics, such as cancer screening, it is better to catch as many actual cases as possible, even if some healthy patients are incorrectly flagged as positive. Here, maximizing recall ensures fewer actual positives are missed.

Understanding when to focus on each metric helps you align model evaluation with real-world objectives and risks.

# Manually compute accuracy, precision, and recall from confusion matrix values

# Example confusion matrix values
TP = 70  # True Positives
TN = 50  # True Negatives
FP = 10  # False Positives
FN = 20  # False Negatives

# Accuracy calculation
accuracy = (TP + TN) / (TP + TN + FP + FN)

# Precision calculation
precision = TP / (TP + FP) if (TP + FP) > 0 else 0

# Recall calculation
recall = TP / (TP + FN) if (TP + FN) > 0 else 0

print(f"Accuracy: {accuracy:.2f}")
print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
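
The same values can also be computed directly from predictions with scikit-learn's metric functions. The sketch below assumes scikit-learn is installed and reuses made-up label arrays for illustration; its output should agree with the manual formulas above:

from sklearn.metrics import accuracy_score, precision_score, recall_score

# Hypothetical example labels: 1 = positive class, 0 = negative class (made up for illustration)
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]  # ground-truth labels
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]  # model predictions

print(f"Accuracy: {accuracy_score(y_true, y_pred):.2f}")    # (TP + TN) / total
print(f"Precision: {precision_score(y_true, y_pred):.2f}")  # TP / (TP + FP)
print(f"Recall: {recall_score(y_true, y_pred):.2f}")        # TP / (TP + FN)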

Which metric should you prioritize when building an email spam filter, where it is important to avoid marking legitimate emails as spam?

