Perform K-means Clustering | Basic Clustering Algorithms
Cluster Analysis

Perform K-means Clustering

Task

Let's see how well the algorithm handles different cluster shapes. We will generate three synthetic datasets with the built-in generators of the sklearn library (make_blobs, make_moons and make_circles), cluster each of them with K-means, and use the resulting plots to judge the quality of the clustering.

Your task is to apply the K-means algorithm to three different clustering problems, compare the results, and draw conclusions about clustering quality. You have to:

  1. Import the KMeans class from the sklearn.cluster module.
  2. Use the KMeans class to instantiate a model object.
  3. Use the .fit() method to train the model.
  4. Use the .labels_ attribute to extract the fitted cluster labels.
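
The four steps above can be sketched on a tiny dataset before moving to the full exercise. This is only an illustration: the array X_toy and the n_init and random_state settings are assumptions made for the example and are not part of the task.

```python
import numpy as np
from sklearn.cluster import KMeans  # step 1: import KMeans from the cluster module

# Hypothetical toy data: two tight, well-separated groups of 2-D points
X_toy = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                  [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]])

# Step 2: instantiate the model (n_init and random_state are assumed settings)
model = KMeans(n_clusters=2, n_init=10, random_state=0)

# Step 3: fit the model to the data
model.fit(X_toy)

# Step 4: read one cluster index per point from the labels_ attribute
labels = model.labels_
print(labels)
```

The first three points should share one label and the last three the other. Which group receives index 0 and which receives index 1 is arbitrary.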

Solution

from sklearn.datasets import make_blobs, make_moons, make_circles
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans
import warnings

warnings.filterwarnings('ignore')


# Fit KMeans on a dataset and plot the real clusters next to the predicted ones
def check_clustering_quality(X, y, n_clusters):
    fig, axes = plt.subplots(1, 2)
    # Left plot: points colored by their true labels
    axes[0].scatter(X[:, 0], X[:, 1], c=y, cmap='tab20b')
    axes[0].set_title('Real clusters')
    kmeans = KMeans(n_clusters=n_clusters).fit(X)
    # Right plot: points colored by the labels K-means assigned
    axes[1].scatter(X[:, 0], X[:, 1], c=kmeans.labels_, cmap='tab20b')
    axes[1].set_title('Clusters with K-means')
    plt.show()


X, y = make_blobs(n_samples=500, cluster_std=1, centers=3)
check_clustering_quality(X, y, 3)

X, y = make_moons(n_samples=500)
check_clustering_quality(X, y, 2)

X, y = make_circles(n_samples=500)
check_clustering_quality(X, y, 2)

Note

In the visualizations, compare how the points are grouped in the real and predicted clusters, not which color each cluster has. K-means numbers its clusters arbitrarily, so the same cluster may be drawn in different colors in the two plots.
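
The note above can be checked numerically. The sketch below uses adjusted_rand_score from sklearn.metrics, which compares two labelings by how they group the points and ignores the label names. This metric is not part of the original exercise, and the label lists are made up for illustration.

```python
from sklearn.metrics import adjusted_rand_score

# Hypothetical labelings of six points
true_labels = [0, 0, 0, 1, 1, 1]
swapped_labels = [1, 1, 1, 0, 0, 0]      # same grouping, label names swapped
different_grouping = [0, 1, 0, 1, 0, 1]  # a genuinely different grouping

# The score depends only on which points are grouped together
score_same = adjusted_rand_score(true_labels, swapped_labels)
score_diff = adjusted_rand_score(true_labels, different_grouping)
print(score_same, score_diff)
```

The swapped labeling scores 1.0 because it describes the same clusters, while the different grouping scores much lower. The same function could be applied to y and kmeans.labels_ in the exercise as a complement to the visual comparison.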

Section 2. Chapter 2
from sklearn.datasets import make_blobs, make_moons, make_circles
import matplotlib.pyplot as plt
import numpy as np
from sklearn.___ import ___

# We will use this function to fit KMeans on different datasets and to plot results
def check_clustering_quality(X, y, n_clusters):
    fig, axes = plt.subplots(1, 2)
    axes[0].scatter(X[:, 0], X[:, 1], c=y, cmap='tab20b')
    axes[0].set_title('Real clusters')
    kmeans = ___(n_clusters=n_clusters).___(X)
    axes[1].scatter(X[:, 0], X[:, 1], c=kmeans.___, cmap='tab20b')
    axes[1].set_title('Clusters with K-means')

# Now let's use check_clustering_quality function on blobs, moons and circles datasets
X, y = make_blobs(n_samples=500, cluster_std=1, centers=3)
check_clustering_quality(X, y, 3)

X, y = make_moons(n_samples=500)
check_clustering_quality(X, y, 2)

X, y = make_circles(n_samples=500)
check_clustering_quality(X, y, 2)
