Visualizing Bivariate Distributions with KDE, Jointplots, and Hexbin Plots | Bivariate and Correlation Analysis
Exploratory Data Analysis with Python

Visualizing Bivariate Distributions with KDE, Jointplots, and Hexbin Plots

When examining retail data, you often need to understand how two numerical features relate to each other. Jointplots and hexbin plots are two visualization techniques well suited to this kind of bivariate analysis.

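A jointplot shows the relationship between two variables in a central panel and the distribution of each variable on its own in the margins. In seaborn, the central panel can be a scatterplot, a 2D histogram, or a kernel density estimate (KDE), so a single figure reveals both the joint distribution and the marginal distributions. This helps you spot correlations, clusters, and outliers.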
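To make the idea of a kernel density estimate concrete, here is a minimal pure-Python sketch of a one-dimensional Gaussian KDE. The helper name `gaussian_kde_1d`, the handful of prices, and the bandwidth of 3.0 are all assumptions chosen for illustration; seaborn selects its own bandwidth and works in two dimensions for the joint panel.

```python
import math

def gaussian_kde_1d(samples, bandwidth):
    """Return a function that estimates the density of `samples` at a point."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per observation, centered on that observation.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

# Illustrative prices only (not a real dataset)
prices = [10, 12, 15, 20, 22, 23, 25, 30]
density = gaussian_kde_1d(prices, bandwidth=3.0)  # hand-picked bandwidth (assumption)

# Compare the estimate inside the cluster of observations with a point far away
print(density(22), density(60))
```

A larger bandwidth gives a smoother, flatter curve, while a smaller one follows individual observations more closely.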

A hexbin plot is especially useful for large datasets. It groups data points into hexagonal bins and colors them by frequency, making dense regions and patterns more apparent.

Suppose you are analyzing a retail dataset with features like price and discount. You want to visualize how discounts are distributed with respect to price, and whether there are any visible trends or groupings.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Sample retail data
data = {
    "price": [10, 12, 15, 20, 22, 23, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90],
    "discount": [1, 2, 2, 3, 3, 4, 4, 5, 6, 5, 7, 8, 7, 9, 10, 9, 11, 12, 11, 13]
}
df = pd.DataFrame(data)

# Create a jointplot for price vs. discount
sns.jointplot(data=df, x="price", y="discount", kind="kde", fill=True, cmap="Blues")
plt.suptitle("Joint Distribution of Price and Discount (KDE)", y=1.02)
plt.show()
```
```python
import matplotlib.pyplot as plt

# Use the same data as above (df is the DataFrame built in the previous example)
x = df["price"]
y = df["discount"]

plt.figure(figsize=(6, 5))
plt.hexbin(x, y, gridsize=10, cmap="Blues", edgecolors="gray")
plt.colorbar(label="Count in bin")
plt.xlabel("Price")
plt.ylabel("Discount")
plt.title("Hexbin Plot of Price vs. Discount")
plt.show()
```
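To check what the colors in a hexbin plot encode, the per-bin counts can be read back from the object that `hexbin` returns. The sketch below repeats the sample data so that it runs on its own. The `Agg` backend line is an assumption that lets it run without a display, and can be dropped in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

price = [10, 12, 15, 20, 22, 23, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90]
discount = [1, 2, 2, 3, 3, 4, 4, 5, 6, 5, 7, 8, 7, 9, 10, 9, 11, 12, 11, 13]

fig, ax = plt.subplots(figsize=(6, 5))
hb = ax.hexbin(price, discount, gridsize=10, cmap="Blues")

# One value per hexagon: the number of points that fell inside it
counts = hb.get_array()
print("points binned:", int(counts.sum()))
print("largest bin:", int(counts.max()))
```

With only 20 observations most hexagons are empty, which is why hexbin plots tend to pay off on much larger datasets.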

Both jointplots and hexbin plots help you explore the relationship between two numerical features, but they serve slightly different purposes.

  • A jointplot with KDE:

    • Provides a smooth estimate of the joint probability density;
    • Makes it easier to see general trends, clusters, and the spread of data, even with overlapping points;
    • Displays the marginal distributions, giving you additional context about each variable individually.
  • A hexbin plot:

    • Is especially effective when you have a large number of data points;
    • Reduces overplotting by aggregating points into hexagonal bins;
    • Helps you quickly spot dense areas and potential linear or nonlinear relationships.

Jointplots are more informative for smaller datasets or when you want to examine distribution shapes. Hexbin plots excel at revealing patterns in larger or more complex datasets where scatterplots would become unreadable.

What is a key difference between a jointplot with KDE and a hexbin plot when visualizing two numerical variables?

Section 3. Chapter 3
