Learn How Similar are the Results? | Hierarchical Clustering
Cluster Analysis in Python

How Similar are the Results?

Well done! Let's look at the last line charts you built in the previous chapter.

As you can see, only the ward linkage captured the trend that declines until July. The two results clearly differ, so let's measure how much they differ using the Rand index.
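
To make the metric concrete before you apply it: the Rand index is the share of point pairs on which two clusterings agree, meaning both put the pair in the same cluster or both put it in different clusters. The sketch below checks that definition against rand_score on two small made-up label lists. The helper rand_index_by_pairs and the toy labels are illustrative assumptions, not part of the course data.

```python
from itertools import combinations

from sklearn.metrics import rand_score


def rand_index_by_pairs(labels_a, labels_b):
    """Hypothetical helper: fraction of point pairs on which two clusterings agree."""
    agree = 0
    total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_in_a = labels_a[i] == labels_a[j]
        same_in_b = labels_b[i] == labels_b[j]
        # A pair counts as an agreement when both clusterings treat it the same way
        agree += int(same_in_a == same_in_b)
        total += 1
    return agree / total


# Made-up labels for six points (not the weather data)
toy_a = [0, 0, 1, 1, 2, 2]
toy_b = [1, 1, 0, 2, 2, 2]

manual = rand_index_by_pairs(toy_a, toy_b)
library = rand_score(toy_a, toy_b)
print(f"Pair counting: {manual}, rand_score: {library}")
```

Here 12 of the 15 possible pairs are treated the same way by both labelings, so both computations should give 0.8. A value of 1 means the two clusterings group the points identically.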

Task

Compute the Rand index to compare the clusterings produced by the complete and ward linkages. Follow these steps:

  1. Import the required functions:
  • rand_score from sklearn.metrics.
  • AgglomerativeClustering from sklearn.cluster.
  2. Create two models, model_complete and model_ward, that each perform hierarchical clustering with 4 clusters, using the 'complete' and 'ward' linkages respectively.
  3. Fit each model to columns 3 through 14 of data and predict the labels. Save the labels from model_complete in labels_complete and those from model_ward in labels_ward.
  4. Compute the Rand index using labels_complete and labels_ward.

Solution

# Import the libraries
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.metrics import rand_score
from sklearn.cluster import AgglomerativeClustering

# Read the data
data = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/138ab9ad-aa37-4310-873f-0f62abafb038/Cities+weather.csv', index_col = 0)

# Creating the models
model_complete = AgglomerativeClustering(n_clusters = 4, linkage = 'complete')
model_ward = AgglomerativeClustering(n_clusters = 4, linkage = 'ward')

# Fitting and predicting the labels
labels_complete = model_complete.fit_predict(data.iloc[:,2:14])
labels_ward = model_ward.fit_predict(data.iloc[:,2:14])

# Compute the Rand index
rand_index = rand_score(labels_complete, labels_ward)
print(f"The rand index for complete and ward linkages models is {rand_index}")
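
One point worth keeping in mind when reading the result: the cluster numbers returned by fit_predict are arbitrary, so cluster 0 in labels_complete need not correspond to cluster 0 in labels_ward. The Rand index only looks at which points are grouped together, so renumbering the clusters does not change it. A minimal sketch with made-up labels (not the weather data):

```python
from sklearn.metrics import rand_score

# Made-up labels for six points
labels_one = [0, 0, 1, 1, 2, 2]
labels_renamed = [2, 2, 0, 0, 1, 1]  # same grouping, different cluster numbers
labels_other = [0, 1, 0, 1, 0, 1]    # a genuinely different grouping

same_grouping_score = rand_score(labels_one, labels_renamed)
different_grouping_score = rand_score(labels_one, labels_other)

print(f"Same grouping, renamed clusters: {same_grouping_score}")
print(f"Different grouping: {different_grouping_score}")
```

The first score should be 1, because the grouping is identical, while the second should be lower. This is why the two label arrays from the solution can be compared directly without matching cluster numbers first.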

Section 3. Chapter 8
# Import the libraries
import pandas as pd
import matplotlib.pyplot as plt
___
___

# Read the data
data = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/138ab9ad-aa37-4310-873f-0f62abafb038/Cities+weather.csv', index_col = 0)

# Creating the models
model_complete = ___(___, ___)
model_ward = ___(___, ___)

# Fitting and predicting the labels
labels_complete = ___.___(data.iloc[:,2:14])
labels_ward = ___.___(data.iloc[:,2:14])

# Compute the Rand index
rand_index = ___(___, ___)
print(f"The rand index for complete and ward linkages models is {rand_index}")