
Finding Optimal Number of Clusters Using Silhouette Score


Besides the WSS (within-cluster sum of squares) method, the silhouette score is another valuable metric for determining the optimal number of clusters (K) in K-means. It evaluates how well each data point fits its own cluster compared to the other clusters.

For each data point, the silhouette score considers:

  • Cohesion (a): average distance to points within its cluster;
  • Separation (b): average distance to points in the nearest other cluster.

The silhouette score is calculated as: $(b - a) / \max(a, b)$, ranging from -1 to +1.
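To make the formula concrete, here is a minimal sketch that computes a, b, and the score for a single point using only the Python standard library. The toy data and the helper name `silhouette` are hypothetical, chosen for illustration, and Euclidean distance is assumed.

```python
from math import dist  # Euclidean distance, available since Python 3.8

# Hypothetical toy data: two clusters of 2-D points, keyed by cluster label.
clusters = {
    0: [(0.0, 0.0), (0.0, 2.0)],
    1: [(4.0, 0.0), (4.0, 2.0)],
}

def silhouette(point, own_label, clusters):
    """Silhouette score of one point, given a label -> points mapping."""
    own = list(clusters[own_label])
    own.remove(point)  # exclude the point itself from its own cluster
    if not own:
        return 0.0  # common convention: a single-point cluster scores 0
    # Cohesion: mean distance to the other points in the same cluster.
    a = sum(dist(point, p) for p in own) / len(own)
    # Separation: smallest mean distance to the points of any other cluster.
    b = min(
        sum(dist(point, p) for p in pts) / len(pts)
        for label, pts in clusters.items()
        if label != own_label
    )
    return (b - a) / max(a, b)

s = silhouette((0.0, 0.0), 0, clusters)
print(round(s, 3))  # a = 2, b ≈ 4.236, so s ≈ 0.528
```

Here the point is closer to its own cluster (a = 2) than to the neighboring one (b ≈ 4.236), so the score is positive.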

Score interpretation:

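  • close to +1: the point is well matched to its own cluster and far from neighboring clusters;
  • ~0: the point lies on the boundary between two clusters;
  • close to -1: the point may be assigned to the wrong cluster.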

To find the optimal K using the silhouette score:

  • Run K-means for a range of K values (e.g., K=2 to a reasonable limit);
  • For each K, calculate the average silhouette score over all points;
  • Plot the average silhouette score vs. K;
  • Choose K with the highest average silhouette score.
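The steps above can be sketched with scikit-learn as follows. This is an illustrative sketch, not a definitive implementation: the synthetic dataset from `make_blobs`, the range K = 2 to 7, and the `random_state` values are all assumptions made for the example.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Assumed synthetic data: three well-separated 2-D blobs.
X, _ = make_blobs(
    n_samples=300,
    centers=[(-5, -5), (0, 5), (5, -5)],
    cluster_std=0.8,
    random_state=0,
)

# Run K-means for each candidate K and record the average silhouette score.
# The score needs at least two clusters, so the range starts at K=2.
scores = {}
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# Choose the K with the highest average silhouette score.
best_k = max(scores, key=scores.get)

for k, score in scores.items():
    print(f"K={k}: average silhouette = {score:.3f}")
print("Best K:", best_k)
```

Plotting the values in `scores` against K with any plotting library gives the curve described in the third step.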

Examining a per-point silhouette plot, which shows the score of every point grouped by cluster, can offer deeper insight into cluster consistency. A good clustering has a high average score and similar scores across points, with few or no negative values.
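Per-point scores can also be inspected numerically before drawing anything. The sketch below uses scikit-learn's `silhouette_samples` to summarize the scores within each cluster. The synthetic data and the choice of K = 3 are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_samples, silhouette_score

# Assumed synthetic data: three well-separated 2-D blobs.
X, _ = make_blobs(
    n_samples=300,
    centers=[(-5, -5), (0, 5), (5, -5)],
    cluster_std=0.8,
    random_state=0,
)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# One silhouette score per data point.
point_scores = silhouette_samples(X, labels)

# Summarize consistency per cluster: size, mean, minimum, and the number
# of points with a negative score (possible misassignments).
for c in np.unique(labels):
    s = point_scores[labels == c]
    print(
        f"cluster {c}: n={s.size}, mean={s.mean():.3f}, "
        f"min={s.min():.3f}, negative={(s < 0).sum()}"
    )
```

The mean of `point_scores` equals the value returned by `silhouette_score(X, labels)`, so the per-point view breaks the single average down into its parts.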

In summary, while WSS measures only how compact the clusters are, the silhouette score balances cohesion and separation. Using both provides a more robust approach to finding the optimal K.


What does a high average silhouette score (close to +1) indicate when evaluating clustering results?


Section 1. Chapter 13
