Summary  
This chapter covers generating a clustermap by applying hierarchical clustering to matrix data and visualizing the result with dendrograms. It explains how to adjust clustering through parameters like data scaling, distance metrics, and linkage methods.

General domain of usage  
Exploratory data analysis

A `clustermap` is a matrix plot that combines a **heatmap** with **hierarchical clustering**.

Whereas a standard heatmap displays the data in a fixed grid, a clustermap **reorders** the rows and columns so that similar values sit next to each other. The tree-like diagrams along the axes are called **dendrograms**; they show how the rows and columns were grouped.
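The reordering can be illustrated without any plotting. The following is a minimal sketch, assuming SciPy's `linkage` and `leaves_list` as stand-ins for the clustering step; the toy matrix and the variable names are made up for illustration. It reads off the dendrogram's leaf order and uses it to reorder the rows.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list

# Toy matrix (illustrative only): rows 0 and 2 are similar, rows 1 and 3 are similar.
data = np.array([
    [1.0, 1.1, 0.9],
    [8.0, 8.2, 7.9],
    [1.2, 1.0, 1.1],
    [7.8, 8.1, 8.0],
])

# Build the hierarchical clustering tree over the rows.
row_linkage = linkage(data, method="average", metric="euclidean")

# The leaf order of the tree is the row order a clustered heatmap would draw.
row_order = leaves_list(row_linkage).tolist()

# Reordering places similar rows next to each other.
reordered = data[row_order]
print(row_order)
print(reordered)
```

In the original order the similar rows alternate; after reordering, each pair ends up adjacent, which is what makes blocks of similar color visible in the heatmap.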



## Key Parameters

To control how the clustering works, use these parameters:

* **`standard_scale`**: rescales the data along one axis (0 for rows, 1 for columns) by subtracting the minimum and dividing by the range, so that each row or column spans 0 to 1. This matters when the variables are measured in different units. To standardize to mean 0 and variance 1 instead, use the separate `z_score` parameter;
* **`metric`**: the distance measure to use (for example, `'euclidean'`, `'correlation'`). It determines what "similar" means;
* **`method`**: the linkage algorithm to use (for example, `'single'`, `'complete'`, `'average'`). It determines how clusters are merged.
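To make the scaling and distance options concrete, here is a small sketch that reproduces the arithmetic by hand with pandas and SciPy. It is an illustration of what the options compute, not seaborn's internal code; the DataFrame, its column names, and the sample rows are hypothetical.

```python
import pandas as pd
from scipy.spatial.distance import pdist, squareform

# Hypothetical data: two features measured in very different units.
df = pd.DataFrame({
    "length_cm": [1.0, 2.0, 3.0, 4.0],
    "weight_g": [100.0, 200.0, 300.0, 400.0],
})

# What standard_scale=1 does per column: subtract the minimum, divide by the range.
min_max = (df - df.min()) / (df.max() - df.min())

# What z_score=1 does per column: subtract the mean, divide by the standard deviation.
z = (df - df.mean()) / df.std()

# Two metrics can disagree about which rows are "similar".
rows = [
    [1.0, 2.0, 3.0],     # rising
    [10.0, 20.0, 30.0],  # same shape as the first row, much larger magnitude
    [3.0, 2.0, 1.0],     # similar magnitude to the first row, opposite shape
]
euclid = squareform(pdist(rows, metric="euclidean"))
corr = squareform(pdist(rows, metric="correlation"))

print(min_max)
print(z)
print(euclid.round(2))
print(corr.round(2))
```

Under the Euclidean metric the first row is closer to the third; under the correlation metric it is closer to the second, because correlation distance compares the shape of the profile and ignores its magnitude.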



## Example

Here is a clustermap of the Iris dataset. Each row is one flower; note how flowers of the same species tend to end up next to each other, because their measurements are similar.

import seaborn as sns
import matplotlib.pyplot as plt

# Load dataset
df = sns.load_dataset('iris')
# Prepare matrix (drop non-numeric column for calculation)
species = df.pop("species")

# Create a clustermap
sns.clustermap(
    data=df,
    standard_scale=1,    # Rescale each column to the 0-1 range
    metric='euclidean',  # Distance measure between rows/columns
    method='average',    # Linkage method used to merge clusters
    cmap='viridis',
    figsize=(6, 6)
)

plt.show()
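To check the grouping rather than judge it by eye, the row order can be read back from the object that `sns.clustermap` returns. The sketch below uses synthetic two-group data as a hypothetical stand-in for Iris (so it runs without downloading anything) and colors the rows by group with `row_colors`; the group names and colors are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for Iris: two well-separated groups of five rows each.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=0.1, size=(5, 4))
group_b = rng.normal(loc=5.0, scale=0.1, size=(5, 4))
df = pd.DataFrame(np.vstack([group_a, group_b]), columns=["f1", "f2", "f3", "f4"])
labels = pd.Series(["a"] * 5 + ["b"] * 5, name="group")

# One color per row, drawn as a strip next to the heatmap.
row_colors = labels.map({"a": "tab:blue", "b": "tab:orange"})

g = sns.clustermap(
    df,
    metric="euclidean",
    method="average",
    row_colors=row_colors,
    figsize=(6, 6),
)

# Row positions in the order they are drawn, top to bottom.
row_order = g.dendrogram_row.reordered_ind
ordered_labels = labels.iloc[row_order].tolist()
print(ordered_labels)
```

If the clustering separates the groups, `ordered_labels` shows each group as one contiguous run, and the color strip shows two solid bands.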

import unittest
import importlib
import sys
import pandas as pd
from unittest.mock import patch

# Helper function to dynamically generate test names and assertions
def _dynamic_test(test_case, condition, success_message, failure_message):
    if condition:
        test_case._testMethodName = success_message
        test_case.assertTrue(True, success_message)
    else:
        test_case._testMethodName = failure_message
        test_case.fail(failure_message)

class TestUserCode(unittest.TestCase):

    def setUp(self):
        # Mocking read_csv to prevent network issues and ensure consistent data
        self.patcher_csv = patch('pandas.read_csv')
        self.mock_read_csv = self.patcher_csv.start()
        
        # Create a simple DataFrame suitable for pivoting
        self.mock_df = pd.DataFrame({
            'year': [1949, 1949, 1950, 1950],
            'month': ['Jan', 'Feb', 'Jan', 'Feb'],
            'passengers': [112, 118, 115, 126]
        })
        self.mock_read_csv.return_value = self.mock_df

    def tearDown(self):
        self.patcher_csv.stop()

    # Test custom style configuration
    def test_style_configuration(self):
        with patch('seaborn.clustermap'), patch('matplotlib.pyplot.show'):
            with patch('seaborn.set_style') as mock_style:
                if 'user_code' in sys.modules:
                    importlib.reload(sys.modules['user_code'])
                import user_code
            
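            if not mock_style.called:
                _dynamic_test(self, False, "", "Expected `sns.set_style()` to be called.")
                return

            args, kwargs = mock_style.call_args
            # Accept the style name passed positionally or as the `style` keyword
            style_name = args[0] if args else kwargs.get('style')
            style_name_correct = style_name == 'ticks'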
            
            # Check dictionary configuration for facecolor
            rc_params = args[1] if len(args) > 1 else kwargs.get('rc', {})
            facecolor_correct = rc_params.get('figure.facecolor') == 'seagreen'

            _dynamic_test(
                self,
                style_name_correct and facecolor_correct,
                "The style is set to 'ticks' with 'seagreen' background.",
                "Expected `sns.set_style('ticks', {'figure.facecolor': 'seagreen'})`."
            )

    # Test clustermap parameters
    def test_clustermap_params(self):
        with patch('seaborn.clustermap') as mock_cluster:
            with patch('matplotlib.pyplot.show'):
                if 'user_code' in sys.modules:
                    importlib.reload(sys.modules['user_code'])
                import user_code
            
            if not mock_cluster.called:
                _dynamic_test(self, False, "", "Expected `sns.clustermap()` to be used.")
                return

            args, kwargs = mock_cluster.call_args
            
            # 1. Check Data Binding
            # Since user_code pivots the data, we check if the first arg is the pivoted dataframe
            passed_data = args[0] if args else kwargs.get('data')
            expected_data = user_code.upd_df
            data_check = passed_data is expected_data

            # 2. Check Parameters
            cmap_check = kwargs.get('cmap') == 'vlag'
            scale_check = kwargs.get('standard_scale') == 1
            method_check = kwargs.get('method') == 'single'
            metric_check = kwargs.get('metric') == 'correlation'
            annot_check = kwargs.get('annot') is True
            vmin_check = kwargs.get('vmin') == 0
            vmax_check = kwargs.get('vmax') == 10

            _dynamic_test(
                self,
                all([data_check, cmap_check, scale_check, method_check, metric_check, annot_check, vmin_check, vmax_check]),
                "The `clustermap` parameters (`standard_scale`, `method`, `metric`, `vmin`, etc.) are configured correctly.",
                f"Expected the pivoted DataFrame with `cmap='vlag'`, `standard_scale=1`, `method='single'`, `metric='correlation'`, `annot=True`, `vmin=0`, `vmax=10`. Got: {kwargs}"
            )

    # Test show
    def test_show_used(self):
        with patch('seaborn.clustermap'):
            with patch('matplotlib.pyplot.show') as mock_show:
                if 'user_code' in sys.modules:
                    importlib.reload(sys.modules['user_code'])
                import user_code

                _dynamic_test(
                    self,
                    mock_show.called,
                    "The `plt.show()` function is used.",
                    "Expected `plt.show()` to be used to display the plot."
                )

if __name__ == '__main__':
    unittest.main()

test_code.py
