Confusion Matrix
A confusion matrix is a table that summarizes the performance of a classification model by comparing predicted and actual labels. For binary classification, it is typically shown as a 2x2 matrix:
$$\begin{bmatrix} TP & FN \\ FP & TN \end{bmatrix}$$
- TP (True Positives): Number of positive samples correctly predicted as positive;
- FP (False Positives): Number of negative samples incorrectly predicted as positive;
- FN (False Negatives): Number of positive samples incorrectly predicted as negative;
- TN (True Negatives): Number of negative samples correctly predicted as negative.
Rows represent the actual class, and columns represent the predicted class. This layout helps you quickly identify both correct predictions and types of errors.
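To see how these four counts arise, the short sketch below tallies them by hand for a small invented set of 0/1 labels (the two label lists are illustrative, not taken from any real model):

# Toy ground-truth and predicted labels (1 = positive, 0 = negative)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Tally each cell of the 2x2 confusion matrix
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

print(f"TP={tp}, FP={fp}, FN={fn}, TN={tn}")  # TP=3, FP=1, FN=1, TN=3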
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Load a dataset and prepare data
iris = load_iris()
X = iris.data
y = iris.target

# For simplicity, use only two classes
X_binary = X[y != 2]
y_binary = y[y != 2]

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X_binary, y_binary, test_size=0.3, random_state=42
)

# Train a classifier
clf = LogisticRegression()
clf.fit(X_train, y_train)

# Predict on the test set
y_pred = clf.predict(X_test)

# Compute the confusion matrix
cm = confusion_matrix(y_test, y_pred)

# Visualize the confusion matrix
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=clf.classes_)
disp.plot(cmap=plt.cm.Blues)
plt.title("Confusion Matrix")
plt.show()
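Beyond plotting, the counts in cm can be unpacked to compute summary metrics. One caveat: scikit-learn orders rows and columns by sorted label value, so with 0/1 labels the top-left cell is TN rather than TP as in the matrix shown earlier, and ravel() yields the cells in the order TN, FP, FN, TP. A brief sketch, continuing from the cm computed above:

# Unpack the 2x2 matrix; scikit-learn lays it out as [[TN, FP], [FN, TP]]
tn, fp, fn, tp = cm.ravel()

# Derive common metrics from the four counts
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(f"Accuracy: {accuracy:.2f}, Precision: {precision:.2f}, Recall: {recall:.2f}")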
Interpreting a confusion matrix helps you pinpoint where your classification model is performing well and where it is struggling. High values on the diagonal indicate correct predictions, while off-diagonal values highlight misclassifications. By examining which classes are most often confused, you can identify specific weaknesses, such as a tendency to produce more false positives or false negatives for a particular class. This insight is invaluable for refining your model, adjusting thresholds, or collecting more representative data to address the observed shortcomings.
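As a concrete instance of the threshold adjustment mentioned above, the sketch below reuses clf, X_test, and y_test from the earlier snippet: instead of predict's default 0.5 cutoff, it thresholds the positive-class probability from predict_proba directly (the 0.3 cutoff here is an arbitrary illustrative value) and recomputes the confusion matrix:

# Probability assigned to the positive class (second column of predict_proba)
proba = clf.predict_proba(X_test)[:, 1]

# Lowering the threshold below 0.5 flags more samples as positive,
# which typically trades false negatives for extra false positives
threshold = 0.3
y_pred_custom = (proba >= threshold).astype(int)

cm_custom = confusion_matrix(y_test, y_pred_custom)
print(cm_custom)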