Exploratory Data Analysis with Python

Visualizing Bivariate Distributions with KDE, Jointplots, and Hexbin Plots

When examining retail data, you often need to understand how two numerical features relate to each other. Advanced visualization techniques such as jointplots and hexbin plots are powerful tools for this kind of bivariate analysis.

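A jointplot pairs a central bivariate plot (a scatterplot, a two-dimensional kernel density estimate (KDE), or hexagonal bins) with a histogram or density curve for each variable along the margins. It therefore reveals both the joint distribution and the marginal distributions of two variables, which helps you spot correlations, clusters, and outliers.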

A hexbin plot is especially useful for large datasets. It groups data points into hexagonal bins and colors them by frequency, making dense regions and patterns more apparent.

Suppose you are analyzing a retail dataset with features like price and discount. You want to visualize how discounts are distributed with respect to price, and whether there are any visible trends or groupings.

    import pandas as pd
    import seaborn as sns
    import matplotlib.pyplot as plt

    # Sample retail data
    data = {
        "price": [10, 12, 15, 20, 22, 23, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90],
        "discount": [1, 2, 2, 3, 3, 4, 4, 5, 6, 5, 7, 8, 7, 9, 10, 9, 11, 12, 11, 13]
    }
    df = pd.DataFrame(data)

    # Create a jointplot for price vs. discount
    sns.jointplot(data=df, x="price", y="discount", kind="kde", fill=True, cmap="Blues")
    plt.suptitle("Joint Distribution of Price and Discount (KDE)", y=1.02)
    plt.show()
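To make concrete what the KDE in the jointplot estimates, here is a minimal NumPy sketch that places one Gaussian bump on each observation and averages them over a grid. It is a simplified illustration, not seaborn's implementation: the helper names `gaussian_kernel` and `kde_2d`, the product-kernel form, and the fixed bandwidths are all assumptions made for this example, whereas seaborn selects its bandwidth automatically, so its contours will differ.

```python
import numpy as np

# Price and discount values from the lesson's sample retail data.
price = np.array([10, 12, 15, 20, 22, 23, 25, 30, 35, 40,
                  45, 50, 55, 60, 65, 70, 75, 80, 85, 90], dtype=float)
discount = np.array([1, 2, 2, 3, 3, 4, 4, 5, 6, 5,
                     7, 8, 7, 9, 10, 9, 11, 12, 11, 13], dtype=float)


def gaussian_kernel(grid, points, bandwidth):
    """Return a (len(grid), len(points)) matrix of 1-D Gaussian kernel values."""
    z = (grid[:, None] - points[None, :]) / bandwidth
    return np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2.0 * np.pi))


def kde_2d(x, y, grid_x, grid_y, bw_x, bw_y):
    """Average one product-Gaussian bump per observation over a grid."""
    kx = gaussian_kernel(grid_x, x, bw_x)
    ky = gaussian_kernel(grid_y, y, bw_y)
    # density[i, j] = mean over observations k of kx[i, k] * ky[j, k]
    return kx @ ky.T / len(x)


# Hand-picked (hypothetical) bandwidths; seaborn chooses its own by default.
BW_PRICE, BW_DISCOUNT = 10.0, 1.5

# The grid extends well past the data so the kernel tails are covered.
grid_price = np.linspace(-50.0, 150.0, 401)
grid_discount = np.linspace(-8.0, 22.0, 301)

density = kde_2d(price, discount, grid_price, grid_discount,
                 BW_PRICE, BW_DISCOUNT)

# A density should integrate to about 1 (Riemann sum over the grid).
cell_area = (grid_price[1] - grid_price[0]) * (grid_discount[1] - grid_discount[0])
total_mass = density.sum() * cell_area

# Grid location of the highest estimated density.
i, j = np.unravel_index(density.argmax(), density.shape)
print(f"total mass ~ {total_mass:.3f}")
print(f"density peaks near price={grid_price[i]:.1f}, discount={grid_discount[j]:.1f}")
```

Larger bandwidths smooth the surface and can hide clusters, while smaller ones make it bumpier. This is why a KDE plot of the same data can look quite different under different settings.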
    import matplotlib.pyplot as plt

    # Reuse the DataFrame `df` created in the jointplot example above
    x = df["price"]
    y = df["discount"]

    plt.figure(figsize=(6, 5))
    # Each hexagon is colored by the number of points that fall inside it
    plt.hexbin(x, y, gridsize=10, cmap="Blues", edgecolors="gray")
    plt.colorbar(label="Count in bin")
    plt.xlabel("Price")
    plt.ylabel("Discount")
    plt.title("Hexbin Plot of Price vs. Discount")
    plt.show()
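The counts that the hexbin colors encode can also be read back from the object Matplotlib returns, which is a quick way to check what the plot is aggregating. The sketch below uses the lesson's sample data and assumes the default settings, under which empty hexagons are kept with a count of 0. The `Agg` backend line is only there so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; omit it in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Price and discount values from the lesson's sample retail data.
price = np.array([10, 12, 15, 20, 22, 23, 25, 30, 35, 40,
                  45, 50, 55, 60, 65, 70, 75, 80, 85, 90], dtype=float)
discount = np.array([1, 2, 2, 3, 3, 4, 4, 5, 6, 5,
                     7, 8, 7, 9, 10, 9, 11, 12, 11, 13], dtype=float)

fig, ax = plt.subplots(figsize=(6, 5))
hb = ax.hexbin(price, discount, gridsize=10, cmap="Blues")

counts = hb.get_array()      # one count per hexagon
centers = hb.get_offsets()   # (x, y) center of each hexagon

print("hexagons in the grid:", len(counts))
print("non-empty hexagons:", int((counts > 0).sum()))
print("points accounted for:", int(counts.sum()))

# Center and count of the hexagon holding the most points.
densest = int(np.argmax(counts))
print("densest hexagon center:", centers[densest], "count:", int(counts[densest]))

plt.close(fig)
```

With only 20 points most hexagons stay empty, which is a reminder that hexbin plots pay off mainly on much larger datasets.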

Both jointplots and hexbin plots help you explore the relationship between two numerical features, but they serve slightly different purposes.

  • A jointplot with KDE:

    • Provides a smooth estimate of the joint probability density;
    • Makes it easier to see general trends, clusters, and the spread of data—even with overlapping points;
    • Displays the marginal distributions, giving you additional context about each variable individually.
  • A hexbin plot:

    • Is especially effective when you have a large number of data points;
    • Reduces overplotting by aggregating points into hexagonal bins;
    • Helps you quickly spot dense areas and potential linear or nonlinear relationships.
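The link between the joint distribution and the marginal distributions mentioned above can be checked numerically. The following sketch bins the lesson's sample data into a coarse joint histogram and shows that summing out one variable reproduces the histogram of the other. The choice of four bins per axis is an arbitrary assumption for illustration.

```python
import numpy as np

# Price and discount values from the lesson's sample retail data.
price = np.array([10, 12, 15, 20, 22, 23, 25, 30, 35, 40,
                  45, 50, 55, 60, 65, 70, 75, 80, 85, 90], dtype=float)
discount = np.array([1, 2, 2, 3, 3, 4, 4, 5, 6, 5,
                     7, 8, 7, 9, 10, 9, 11, 12, 11, 13], dtype=float)

# Joint histogram: counts for every (price bin, discount bin) pair.
joint, price_edges, discount_edges = np.histogram2d(price, discount, bins=(4, 4))

# Summing out one variable leaves the marginal counts of the other.
price_marginal = joint.sum(axis=1)
discount_marginal = joint.sum(axis=0)

# The same marginals computed directly from each variable on its own.
price_direct, _ = np.histogram(price, bins=price_edges)
discount_direct, _ = np.histogram(discount, bins=discount_edges)

print("joint counts (rows: price bins, columns: discount bins):")
print(joint.astype(int))
print("price marginal:   ", price_marginal.astype(int), "direct:", price_direct)
print("discount marginal:", discount_marginal.astype(int), "direct:", discount_direct)
```

The marginal panels of a jointplot carry the same information as these row and column sums, drawn as histograms or density curves.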

KDE jointplots are more informative for smaller datasets or when you want to examine the shape of each distribution. Hexbin plots excel at revealing patterns in larger datasets, where the points in a scatterplot would overlap too heavily to read.

What is a key difference between a jointplot with KDE and a hexbin plot when visualizing two numerical variables?


Section 3. Chapter 3
