Performing Hierarchical Clustering
Statistical Visualization with Seaborn

A clustermap is a matrix plot that combines a heatmap with hierarchical clustering.

While a standard heatmap displays data in a fixed grid, a clustermap reorders the rows and columns to place similar values next to each other. The tree-like diagrams on the axes are called dendrograms, and they show how the data points are grouped.
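To make the reordering concrete, here is a minimal sketch on a small made-up table. The toy DataFrame and its row labels are assumptions for illustration only. The row order is read from the `dendrogram_row.reordered_ind` attribute of the object that `sns.clustermap` returns.

```python
import pandas as pd
import seaborn as sns

# Hypothetical toy data: two groups of rows with clearly different levels,
# interleaved on purpose so the original order hides the grouping
toy = pd.DataFrame(
    [[1.0, 1.1, 0.9],
     [9.0, 9.2, 8.8],
     [1.2, 0.9, 1.0],
     [8.9, 9.1, 9.0]],
    index=["low_a", "high_a", "low_b", "high_b"],
    columns=["x", "y", "z"],
)

# Cluster the rows only, to keep the example focused
g = sns.clustermap(toy, col_cluster=False, figsize=(4, 4))

# Positions of the original rows, in the order the dendrogram places them
row_order = g.dendrogram_row.reordered_ind
reordered_labels = [toy.index[i] for i in row_order]
print(reordered_labels)
```

In the printed list, the two "low" rows should sit next to each other, and so should the two "high" rows, even though they alternate in the input.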

Key Parameters

To control how the clustering works, you can use these parameters:

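  • standard_scale: rescales the data (0 for rows, 1 for columns) so that each row or column runs from 0 to 1, by subtracting its minimum and dividing by its maximum. This is crucial when variables have different units. To standardize to a mean of 0 and a variance of 1 instead, use the z_score parameter;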
  • metric: the distance measure to use (e.g., 'euclidean', 'correlation'). It determines what "similar" means;
  • method: the linkage algorithm to use (e.g., 'single', 'complete', 'average'). It determines how the distance between two clusters is measured when deciding which ones to merge.
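The seaborn documentation describes standard_scale as subtracting the minimum and dividing by the maximum, while the separate z_score parameter subtracts the mean and divides by the standard deviation. The sketch below reproduces both ideas by hand with pandas so the difference is visible. The DataFrame and its column names are made up for illustration, and this is not seaborn's internal code.

```python
import pandas as pd

# Hypothetical measurements recorded in very different units
df = pd.DataFrame({
    "height_cm": [150.0, 160.0, 170.0, 180.0],
    "weight_kg": [50.0, 65.0, 80.0, 95.0],
})

# Min-max scaling per column: the idea behind standard_scale=1
shifted = df - df.min()
min_max = shifted / shifted.max()

# Z-scoring per column: the idea behind z_score=1
z_scored = (df - df.mean()) / df.std()

print(min_max)    # every column now runs from 0 to 1
print(z_scored)   # every column now has mean 0 and standard deviation 1
```

Either rescaling keeps a column with large raw numbers from dominating the distance calculation. Only one of the two parameters can be passed to a single clustermap call.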

Example

Here is a clustermap of the Iris dataset. The species column is removed before plotting, so each row is a single flower. Notice how flowers with similar measurements, which usually belong to the same species, are automatically grouped together.

    import seaborn as sns
    import matplotlib.pyplot as plt

    # Load dataset
    df = sns.load_dataset('iris')

    # Prepare matrix (drop non-numeric column for calculation)
    species = df.pop("species")

    # Create a clustermap
    sns.clustermap(
        data=df,
        standard_scale=1,    # Normalize columns
        metric='euclidean',  # Measure distance
        method='average',    # Clustering method
        cmap='viridis',
        figsize=(6, 6)
    )

    plt.show()
Task

Analyze the flight passengers data to find similarities between years.

  1. Set the style to 'ticks' and change the figure background color to 'seagreen' (the 'figure.facecolor' parameter).
  2. Create a clustermap using the reshaped upd_df DataFrame:
    • Pass upd_df as the data.
    • Normalize the columns by setting standard_scale to 1.
    • Use the 'single' clustering method.
    • Use 'correlation' as the distance metric.
    • Display values in cells (annot=True).
    • Set the value limits: vmin=0 and vmax=10.
    • Use the 'vlag' color map.
  3. Display the plot.
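
One possible way to carry out these steps is sketched below. The real upd_df is not shown on this page, so the code builds a hypothetical stand-in with months as rows and years as columns, filled with synthetic values. Only the styling and clustermap calls correspond to the task.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical stand-in for upd_df (assumption: months as rows, years as columns).
# The values are synthetic, not the real flight passengers data.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
years = [1949, 1950, 1951, 1952, 1953, 1954]
i = np.arange(12).reshape(-1, 1)   # month position
j = np.arange(6).reshape(1, -1)    # year position
values = 300 + 100 * np.sin(2 * np.pi * (i + j) / 12) + 20 * j
upd_df = pd.DataFrame(values, index=months, columns=years)

# Step 1: 'ticks' style with a seagreen figure background
sns.set_style("ticks", {"figure.facecolor": "seagreen"})

# Step 2: clustermap with the requested settings
g = sns.clustermap(
    data=upd_df,
    standard_scale=1,      # rescale each column
    method="single",       # linkage method
    metric="correlation",  # distance metric
    annot=True,            # show values in cells
    vmin=0,
    vmax=10,
    cmap="vlag",
)

# Step 3: display the plot
plt.show()
```

With the real data, only the construction of upd_df changes. The three numbered steps stay the same.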
