Confusion Matrix
A confusion matrix is a table that summarizes the performance of a classification model by comparing predicted and actual labels. For binary classification, it is typically shown as a 2x2 matrix:
$$\begin{bmatrix} TP & FN \\ FP & TN \end{bmatrix}$$
- TP (True Positives): Number of positive samples correctly predicted as positive;
- FP (False Positives): Number of negative samples incorrectly predicted as positive;
- FN (False Negatives): Number of positive samples incorrectly predicted as negative;
- TN (True Negatives): Number of negative samples correctly predicted as negative.
Rows represent the actual class, and columns represent the predicted class. This layout helps you quickly identify both correct predictions and types of errors.
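To see how these four counts arise, the short sketch below tallies them by hand for a small invented set of 0/1 labels (the two label lists are illustrative, not taken from any real model):

# Toy ground-truth and predicted labels (1 = positive, 0 = negative)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Tally each cell of the 2x2 confusion matrix
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

print(f"TP={tp}, FP={fp}, FN={fn}, TN={tn}")  # TP=3, FP=1, FN=1, TN=3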
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Load a dataset and prepare data
iris = load_iris()
X = iris.data
y = iris.target

# For simplicity, use only two classes
X_binary = X[y != 2]
y_binary = y[y != 2]

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X_binary, y_binary, test_size=0.3, random_state=42
)

# Train a classifier
clf = LogisticRegression()
clf.fit(X_train, y_train)

# Predict on the test set
y_pred = clf.predict(X_test)

# Compute the confusion matrix
cm = confusion_matrix(y_test, y_pred)

# Visualize the confusion matrix
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=clf.classes_)
disp.plot(cmap=plt.cm.Blues)
plt.title("Confusion Matrix")
plt.show()
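Beyond plotting, the counts in cm can be unpacked to compute summary metrics. One caveat: scikit-learn orders rows and columns by sorted label value, so with 0/1 labels the top-left cell is TN rather than TP as in the matrix shown earlier, and ravel() yields the cells in the order TN, FP, FN, TP. A brief sketch, continuing from the cm computed above:

# Unpack the 2x2 matrix; scikit-learn lays it out as [[TN, FP], [FN, TP]]
tn, fp, fn, tp = cm.ravel()

# Derive common metrics from the four counts
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(f"Accuracy: {accuracy:.2f}, Precision: {precision:.2f}, Recall: {recall:.2f}")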
Interpreting a confusion matrix helps you pinpoint where your classification model is performing well and where it is struggling. High values on the diagonal indicate correct predictions, while off-diagonal values highlight misclassifications. By examining which classes are most often confused, you can identify specific weaknesses, such as a tendency to produce more false positives or false negatives for a particular class. This insight is invaluable for refining your model, adjusting thresholds, or collecting more representative data to address the observed shortcomings.
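As a concrete instance of the threshold adjustment mentioned above, the sketch below reuses clf, X_test, and y_test from the earlier snippet: instead of predict's default 0.5 cutoff, it thresholds the positive-class probability from predict_proba directly (the 0.3 cutoff here is an arbitrary illustrative value) and recomputes the confusion matrix:

# Probability assigned to the positive class (second column of predict_proba)
proba = clf.predict_proba(X_test)[:, 1]

# Lowering the threshold below 0.5 flags more samples as positive,
# which typically trades false negatives for extra false positives
threshold = 0.3
y_pred_custom = (proba >= threshold).astype(int)

cm_custom = confusion_matrix(y_test, y_pred_custom)
print(cm_custom)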