Confusion Matrix

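A confusion matrix is a table that summarizes the performance of a classification model by comparing its predicted labels with the actual labels.

For binary classification, it is usually written as a 2x2 matrix: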

\begin{bmatrix} \text{TP} & \text{FN} \\ \text{FP} & \text{TN} \end{bmatrix}

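TP (True Positives): number of positive samples correctly predicted as positive
FP (False Positives): number of negative samples incorrectly predicted as positive
FN (False Negatives): number of positive samples incorrectly predicted as negative
TN (True Negatives): number of negative samples correctly predicted as negative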

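Rows represent the actual classes and columns represent the predicted classes. This layout shows at a glance how many predictions were correct and which kinds of errors occurred.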
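
To make the four cells concrete, here is a minimal sketch that counts them by hand and arranges them with rows as actual classes and columns as predicted classes. The label lists and the `count_cells` helper are hypothetical, made up only for illustration.

```python
# Hypothetical labels for illustration: 1 = positive, 0 = negative.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]


def count_cells(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a binary problem."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # positive predicted as positive
            else:
                fp += 1  # negative predicted as positive
        else:
            if actual == positive:
                fn += 1  # positive predicted as negative
            else:
                tn += 1  # negative predicted as negative
    return tp, fp, fn, tn


tp, fp, fn, tn = count_cells(y_true, y_pred)

# Rows are actual classes (positive first), columns are predicted classes.
matrix = [[tp, fn],
          [fp, tn]]
print(matrix)
```

The four counts always add up to the number of samples, which is a quick sanity check when building the matrix manually.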


import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Load a dataset and prepare data
iris = load_iris()
X = iris.data
y = iris.target

# For simplicity, use only two classes
X_binary = X[y != 2]
y_binary = y[y != 2]

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X_binary, y_binary, test_size=0.3, random_state=42)

# Train a classifier
clf = LogisticRegression()
clf.fit(X_train, y_train)

# Predict on test set
y_pred = clf.predict(X_test)

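# Compute confusion matrix (rows = actual classes, columns = predicted classes).
# Labels are sorted, so with classes 0 and 1 the layout is [[TN, FP], [FN, TP]]
# when class 1 is treated as the positive class.
cm = confusion_matrix(y_test, y_pred)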

# Visualize confusion matrix
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=clf.classes_)
disp.plot(cmap=plt.cm.Blues)
plt.title("Confusion Matrix")
plt.show()
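
The listing above calls `clf.predict`, which for a binary logistic regression corresponds to a 0.5 cutoff on the predicted probability. As a rough sketch of how moving that cutoff shifts counts between the cells, the example below recomputes the four counts at several thresholds. The scores and the `cells_at_threshold` helper are hypothetical and use plain Python rather than the classifier above.

```python
# Hypothetical data for illustration: true labels and the model's
# estimated probability of the positive class (1) for each sample.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.10, 0.35, 0.45, 0.60, 0.40, 0.55, 0.80, 0.95]


def cells_at_threshold(y_true, scores, threshold):
    """Return (TP, FP, FN, TN) when score >= threshold is predicted positive."""
    tp = fp = fn = tn = 0
    for actual, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and actual == 1:
            tp += 1
        elif predicted == 1 and actual == 0:
            fp += 1
        elif predicted == 0 and actual == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


results = {}
for threshold in (0.3, 0.5, 0.7):
    results[threshold] = cells_at_threshold(y_true, scores, threshold)
    tp, fp, fn, tn = results[threshold]
    print(f"threshold={threshold}: TP={tp} FP={fp} FN={fn} TN={tn}")
```

On this toy data, a lower threshold removes false negatives at the cost of more false positives, and a higher one does the opposite. With the scikit-learn model, the scores would presumably come from `clf.predict_proba(X_test)[:, 1]`.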

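Interpreting the confusion matrix shows where a classification model performs well and where it struggles. Large values on the diagonal indicate correct predictions, while off-diagonal values indicate misclassifications. Examining which classes are confused most often reveals specific weaknesses, such as a tendency toward false positives or false negatives for a particular class. These insights can guide model improvements, threshold adjustments, or the collection of more representative data.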
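
The reading described above can be sketched in a few lines: find the largest off-diagonal cell to see which pair of classes is confused most often, then derive per-class false positives, false negatives, precision, and recall from the row and column totals. The three-class matrix and its labels are hypothetical values chosen for illustration, with rows as actual classes and columns as predicted classes.

```python
# Hypothetical 3-class confusion matrix for illustration.
# Rows are actual classes, columns are predicted classes.
labels = ["cat", "dog", "fox"]
cm = [
    [40, 8, 2],   # actual cat
    [5, 42, 3],   # actual dog
    [1, 12, 37],  # actual fox
]
n = len(labels)

# Largest off-diagonal cell: the (actual, predicted) pair confused most often.
count, actual, predicted = max(
    (cm[i][j], i, j) for i in range(n) for j in range(n) if i != j
)
print(f"most confused: actual {labels[actual]} predicted as "
      f"{labels[predicted]} ({count} times)")

# Per-class error counts and metrics.
stats = {}
for k in range(n):
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                       # row total minus diagonal
    fp = sum(cm[i][k] for i in range(n)) - tp  # column total minus diagonal
    stats[labels[k]] = {
        "fp": fp,
        "fn": fn,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }
    print(f"{labels[k]}: FP={fp} FN={fn} "
          f"precision={stats[labels[k]]['precision']:.2f} "
          f"recall={stats[labels[k]]['recall']:.2f}")
```

In this made-up matrix, the "dog" column collects most of the errors, so its precision is low even though its recall is reasonable. That is the kind of class-specific weakness the matrix is meant to expose.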


Section 1. Chapter 9
