Performing Hierarchical Clustering
A clustermap is a matrix plot that combines a heatmap with hierarchical clustering.
While a standard heatmap displays data in a fixed grid, a clustermap reorders the rows and columns to place similar values next to each other. The tree-like diagrams on the axes are called dendrograms, and they show how the data points are grouped.
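As a rough illustration of the reordering step, the sketch below calls SciPy's hierarchical clustering routines directly, which is the kind of computation a clustermap performs before drawing. The toy matrix and the variable names are made up for this example and are not part of the lesson's data.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage

# Toy matrix (made-up values): rows 0 and 2 are alike, rows 1 and 3 are alike
data = np.array([
    [0.0, 0.0],
    [10.0, 10.0],
    [0.5, 0.2],
    [10.1, 9.8],
])

# Build the merge tree with average linkage on Euclidean distances
tree = linkage(data, method='average', metric='euclidean')

# Leaf order of the dendrogram: similar rows end up next to each other
order = leaves_list(tree).tolist()
reordered = data[order]

print(order)
print(reordered)
```

In the printed order, rows 0 and 2 sit side by side, as do rows 1 and 3. A clustermap applies the same idea to both the rows and the columns of the heatmap.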
Key Parameters
To control how the clustering works, you can use these parameters:
- standard_scale: rescales the data (0 for rows, 1 for columns) to the range 0 to 1 by subtracting the minimum and dividing by the resulting maximum. This is crucial when variables have different units. To standardize to a mean of 0 and a variance of 1 instead, use the z_score parameter.
- metric: the distance measure to use (e.g., 'euclidean', 'correlation'). It determines what "similar" means.
- method: the linkage algorithm to use (e.g., 'single', 'complete', 'average'). It determines how clusters are merged.
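In seaborn, standard_scale performs min-max scaling, while the separate z_score parameter produces a mean of 0 and a variance of 1. The arithmetic can be sketched in plain Python; the helper functions min_max_scale and z_score_scale and the sample column are hypothetical and only illustrate the two formulas.

```python
from statistics import mean, stdev


def min_max_scale(values):
    """Rescale to [0, 1]: subtract the minimum, then divide by the range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def z_score_scale(values):
    """Center on the mean and divide by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


column = [2.0, 4.0, 6.0]  # made-up measurements for one feature

scaled = min_max_scale(column)
standardized = z_score_scale(column)

print(scaled)        # [0.0, 0.5, 1.0]
print(standardized)  # [-1.0, 0.0, 1.0]
```

Either transformation puts features with different units on a comparable footing before distances are computed.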
Example
Here is a clustermap of the Iris dataset. Each row is one flower. Rows with similar measurements, which tend to belong to the same species, are automatically placed next to each other.
    import seaborn as sns
    import matplotlib.pyplot as plt

    # Load dataset
    df = sns.load_dataset('iris')

    # Prepare matrix (drop non-numeric column for calculation)
    species = df.pop("species")

    # Create a clustermap
    sns.clustermap(
        data=df,
        standard_scale=1,    # Rescale each column to the 0-1 range
        metric='euclidean',  # Distance measure
        method='average',    # Linkage (clustering) method
        cmap='viridis',
        figsize=(6, 6)
    )

    plt.show()
Task
Analyze the flight passengers data to find similarities between years.
- Set the style to 'ticks'. Change the background color to 'seagreen' ('figure.facecolor').
- Create a clustermap using the reshaped upd_df DataFrame:
  - Pass upd_df as the data.
  - Normalize the columns by setting standard_scale to 1.
  - Use the 'single' clustering method.
  - Use 'correlation' as the distance metric.
  - Display values in cells (annot=True).
  - Set the value limits: vmin=0 and vmax=10.
  - Use the 'vlag' color map.
- Display the plot.
Solution
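One possible way to meet the task requirements is sketched below. The function name plot_year_clustermap is hypothetical, and the sketch assumes upd_df is already a numeric matrix (for example, months as rows and years as columns), since the reshaping step is not shown in this lesson.

```python
import matplotlib.pyplot as plt
import seaborn as sns


def plot_year_clustermap(upd_df):
    """Draw the clustermap described in the task and return the ClusterGrid."""
    # 'ticks' style with a seagreen figure background
    sns.set_style('ticks', rc={'figure.facecolor': 'seagreen'})

    grid = sns.clustermap(
        upd_df,
        standard_scale=1,      # rescale each column to the 0-1 range
        method='single',       # single-linkage clustering
        metric='correlation',  # correlation distance
        annot=True,            # print the value in each cell
        vmin=0,
        vmax=10,
        cmap='vlag',
    )
    return grid
```

Assuming the flights data is reshaped with sns.load_dataset('flights').pivot(index='month', columns='year', values='passengers'), calling plot_year_clustermap(upd_df) followed by plt.show() displays the plot.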