Learn Perform Agglomerative Clustering | Basic Clustering Algorithms
Cluster Analysis

Perform Agglomerative Clustering

Task

Your task is to compare how agglomerative clustering performs on the moons and circles datasets with different linkage types. You have to:

  1. Import the AgglomerativeClustering class from the sklearn.cluster module.
  2. Add a parameter named linkage to the function's signature.
  3. Call the .fit() method of the agglomerative object to train the model.
  4. Pass 'single', 'complete', and 'average' as the linkage argument in the function calls (use them in the code in this same order).

Solution

from sklearn.datasets import make_moons, make_circles
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def check_linkage_parameter(X, y, ds_name, linkage):
    agglomerative = AgglomerativeClustering(linkage=linkage,
                                            distance_threshold=0.5, n_clusters=None)
    agglomerative.fit(X)
    fig, axes = plt.subplots(1, 2)
    fig.suptitle(ds_name+' dataset: '+ str(linkage)+' linkage')
    axes[0].scatter(X[:, 0], X[:, 1], c=y, cmap='tab20b')
    axes[0].set_title('Real clusters')
    axes[1].scatter(X[:, 0], X[:, 1], c=agglomerative.labels_, cmap='tab20b')
    axes[1].set_title('Clusters with Agglomerative')

X, y = make_moons(n_samples=500)
check_linkage_parameter(X, y, 'Moons', linkage='single')
check_linkage_parameter(X, y, 'Moons', linkage='complete')
check_linkage_parameter(X, y, 'Moons', linkage='average')

X, y = make_circles(n_samples=500)
check_linkage_parameter(X, y, 'Circles', linkage='single')
check_linkage_parameter(X, y, 'Circles', linkage='complete')
check_linkage_parameter(X, y, 'Circles', linkage='average')
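
The plots from the solution compare the linkage types visually. As an optional complement, the sketch below scores each linkage numerically against the real labels. It is not part of the task: the helper name linkage_scores, the choice of the adjusted Rand index (adjusted_rand_score from sklearn.metrics) as the quality measure, and random_state=0 are assumptions made for illustration. The clustering settings (distance_threshold=0.5, n_clusters=None) are the same as in the solution.

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_moons, make_circles
from sklearn.metrics import adjusted_rand_score


# Hypothetical helper (not from the course): returns one score per linkage type.
def linkage_scores(X, y, linkages=('single', 'complete', 'average')):
    scores = {}
    for linkage in linkages:
        # Same settings as the solution: the number of clusters is decided
        # by the distance threshold, not fixed in advance.
        model = AgglomerativeClustering(linkage=linkage,
                                        distance_threshold=0.5, n_clusters=None)
        labels = model.fit_predict(X)
        # Adjusted Rand index: 1.0 means the clusters match the real labels
        # exactly; values near 0 mean agreement no better than chance.
        scores[linkage] = adjusted_rand_score(y, labels)
    return scores


# random_state is fixed only to make the run repeatable (an assumption).
X, y = make_moons(n_samples=500, random_state=0)
moons_scores = linkage_scores(X, y)
print('Moons:', moons_scores)

X, y = make_circles(n_samples=500, random_state=0)
circles_scores = linkage_scores(X, y)
print('Circles:', circles_scores)
```

A higher score means the clusters found with that linkage agree more closely with the real clusters shown in the left-hand plot.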

Section 2. Chapter 4
from sklearn.datasets import make_moons, make_circles
import matplotlib.pyplot as plt
import numpy as np
from sklearn.___ import ___

# This function trains an agglomerative model with the given linkage and plots the results
def check_linkage_parameter(X, y, ds_name, ___):
    agglomerative = AgglomerativeClustering(linkage=linkage,
                                            distance_threshold=0.5, n_clusters=None)
    agglomerative.___(X)
    fig, axes = plt.subplots(1, 2)
    fig.suptitle(ds_name+' Dataset: '+str(linkage)+' linkage')
    axes[0].scatter(X[:, 0], X[:, 1], c=y, cmap='tab20b')
    axes[0].set_title('Real clusters')
    axes[1].scatter(X[:, 0], X[:, 1], c=agglomerative.labels_, cmap='tab20b')
    axes[1].set_title('Clusters with Agglomerative')

# Check clustering quality on moons dataset
X, y = make_moons(n_samples=500)
check_linkage_parameter(X, y, 'Moons', linkage=___)
check_linkage_parameter(X, y, 'Moons', linkage=___)
check_linkage_parameter(X, y, 'Moons', linkage=___)
# Check clustering quality on circles dataset
X, y = make_circles(n_samples=500)
check_linkage_parameter(X, y, 'Circles', linkage='single')
check_linkage_parameter(X, y, 'Circles', linkage='complete')
check_linkage_parameter(X, y, 'Circles', linkage='average')