Clustering Demystified

How many Clusters?

You may be wondering how many clusters to use. There is no single exact answer, but a common heuristic is the so-called "elbow method".

The elbow method is a heuristic for choosing the number of clusters in k-means clustering. You fit the model for a range of cluster counts, plot the inertia (the within-cluster sum of squared distances) against the number of clusters, and pick the "elbow" of the curve. The elbow is the point where the curve bends: before it, each additional cluster lowers the inertia sharply; after it, the decrease slows down. That point is taken as a reasonable number of clusters because adding more clusters beyond it brings only a marginal improvement.

Methods description

  • range(start, end): This generates a sequence of numbers from start (inclusive) to end (exclusive), representing the range of possible cluster numbers to be tested;
  • kmeans.inertia_: This attribute of a fitted KMeans object holds the inertia value calculated for the current clustering configuration;
  • cs: This list starts empty and stores the inertia value calculated for each number of clusters. Inertia is the sum of squared distances from each sample to its closest cluster center;
  • plt.plot(): This function from the matplotlib library (matplotlib.pyplot) is used to create a line plot. It plots the number of clusters on the x-axis against the corresponding inertia values (cs) on the y-axis;
  • plt.title(), plt.xlabel(), plt.ylabel(): These functions set the title, x-axis label, and y-axis label of the plot, respectively;
  • plt.show(): This function displays the plot.
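
The pieces described above can be put together roughly as follows. This is a minimal sketch, not the course's reference solution: the synthetic data from make_blobs, the helper names compute_inertias and plot_elbow, and the n_init and random_state settings are all assumptions made for illustration.

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs


def compute_inertias(X, k_values):
    """Fit KMeans once per candidate k and collect the inertia of each fit."""
    cs = []  # starts empty; one inertia value is appended per k
    for k in k_values:
        # n_init and random_state are illustrative choices, not from the lesson
        kmeans = KMeans(n_clusters=k, n_init=10, random_state=0)
        kmeans.fit(X)
        cs.append(kmeans.inertia_)
    return cs


def plot_elbow(k_values, cs):
    """Draw inertia against the number of clusters."""
    plt.plot(list(k_values), cs, marker="o")
    plt.title("The Elbow Method")
    plt.xlabel("Number of clusters")
    plt.ylabel("Inertia")
    plt.show()


if __name__ == "__main__":
    # Synthetic stand-in data (assumption): 300 points around 4 centers.
    X, _ = make_blobs(n_samples=300, centers=4, random_state=0)
    k_values = range(1, 11)  # 1..10, because the end of range is exclusive
    cs = compute_inertias(X, k_values)
    plot_elbow(k_values, cs)
```

On data like this, the curve typically drops steeply for the first few values of k and then flattens; the bend between the two regimes is the elbow to look for.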

Task

  1. Fit KMeans for each number of clusters from 1 to 10 and store the inertia of each fit.
  2. Plot the inertia values against the number of clusters.
