Visualizing Bivariate Distributions with KDE, Jointplots, and Hexbin Plots | Bivariate and Correlation Analysis
Exploratory Data Analysis with Python


When examining retail data, you often need to understand how two numerical features relate to each other. Jointplots and hexbin plots are two visualizations well suited to this kind of bivariate analysis.

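A jointplot pairs a central bivariate plot with a marginal plot of each variable along the top and right edges. In seaborn the central plot is a scatterplot by default, and the `kind` argument switches it to a kernel density estimate (KDE), a 2D histogram, a hexbin plot, or a regression plot. Seeing the joint distribution and both marginal distributions in one figure helps you spot correlations, clusters, and outliers.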

A hexbin plot is especially useful for large datasets. It groups data points into hexagonal bins and colors them by frequency, making dense regions and patterns more apparent.
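To make the binning concrete, here is a hedged sketch that runs `hexbin` on synthetic data and reads the per-hexagon counts back from the returned collection. The 5,000 points, the seed, and the linear price/discount relationship are invented for illustration and do not come from any real retail dataset.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic data (assumption): 5,000 points with a noisy linear trend
rng = np.random.default_rng(0)
n_points = 5000
x = rng.normal(50, 15, n_points)
y = 0.15 * x + rng.normal(0, 2, n_points)

fig, ax = plt.subplots(figsize=(6, 5))
hb = ax.hexbin(x, y, gridsize=25, cmap="Blues")
fig.colorbar(hb, ax=ax, label="Count in bin")
ax.set_xlabel("x")
ax.set_ylabel("y")

counts = hb.get_array()      # number of points in each hexagon
centers = hb.get_offsets()   # (n_hexagons, 2) array of hexagon centres

# Each point lands in exactly one hexagon, so the counts add up to n_points
print("hexagons:", len(counts), "| total points counted:", int(counts.sum()))
print("densest hexagon holds", int(counts.max()), "points")
```

The color of each hexagon is simply this count mapped through the colormap. A larger `gridsize` gives smaller hexagons and finer detail, at the cost of noisier counts per bin.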

Suppose you are analyzing a retail dataset with features like price and discount. You want to visualize how discounts are distributed with respect to price, and whether there are any visible trends or groupings.

    import pandas as pd
    import seaborn as sns
    import matplotlib.pyplot as plt

    # Sample retail data
    data = {
        "price": [10, 12, 15, 20, 22, 23, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90],
        "discount": [1, 2, 2, 3, 3, 4, 4, 5, 6, 5, 7, 8, 7, 9, 10, 9, 11, 12, 11, 13]
    }
    df = pd.DataFrame(data)

    # Create a jointplot for price vs. discount
    sns.jointplot(data=df, x="price", y="discount", kind="kde", fill=True, cmap="Blues")
    plt.suptitle("Joint Distribution of Price and Discount (KDE)", y=1.02)
    plt.show()
    import matplotlib.pyplot as plt

    # Reuse the DataFrame df from the jointplot example above
    x = df["price"]
    y = df["discount"]

    # Group the points into hexagonal bins and color each bin by its count
    plt.figure(figsize=(6, 5))
    plt.hexbin(x, y, gridsize=10, cmap="Blues", edgecolors="gray")
    plt.colorbar(label="Count in bin")
    plt.xlabel("Price")
    plt.ylabel("Discount")
    plt.title("Hexbin Plot of Price vs. Discount")
    plt.show()

Both jointplots and hexbin plots help you explore the relationship between two numerical features, but they serve slightly different purposes.

  • A jointplot with KDE:

    • Provides a smooth estimate of the joint probability density;
    • Makes it easier to see general trends, clusters, and the spread of data—even with overlapping points;
    • Displays the marginal distributions, giving you additional context about each variable individually.
  • A hexbin plot:

    • Is especially effective when you have a large number of data points;
    • Reduces overplotting by aggregating points into hexagonal bins;
    • Helps you quickly spot dense areas and potential linear or nonlinear relationships.
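The "smooth estimate of the joint probability density" can be written out by hand. The sketch below places one 2D Gaussian bump on each observation and averages them, then checks that the result integrates to roughly 1. It is a simplified illustration, not seaborn's implementation: the fixed isotropic bandwidth of 0.5 on standardized data is an assumption made here, whereas seaborn chooses the bandwidth with Scott's rule by default and scales the kernel by the data covariance.

```python
import numpy as np


def gaussian_kde_2d(points, query, bandwidth):
    """Evaluate an isotropic Gaussian KDE built from `points` at `query`.

    points: (n, 2) array of observations
    query:  (m, 2) array of locations where the density is evaluated
    """
    diffs = query[:, None, :] - points[None, :, :]    # shape (m, n, 2)
    sq_dist = (diffs ** 2).sum(axis=-1)               # shape (m, n)
    kernels = np.exp(-sq_dist / (2 * bandwidth ** 2))
    norm = 2 * np.pi * bandwidth ** 2                 # 2D Gaussian normaliser
    return kernels.sum(axis=1) / (len(points) * norm)


# Sample data from the lesson, standardized so both axes share one scale
price = np.array([10, 12, 15, 20, 22, 23, 25, 30, 35, 40,
                  45, 50, 55, 60, 65, 70, 75, 80, 85, 90], dtype=float)
discount = np.array([1, 2, 2, 3, 3, 4, 4, 5, 6, 5,
                     7, 8, 7, 9, 10, 9, 11, 12, 11, 13], dtype=float)
data = np.column_stack([price, discount])
z = (data - data.mean(axis=0)) / data.std(axis=0)

# Evaluate the density on a regular grid covering the standardized data
axis = np.linspace(-5, 5, 201)
gx, gy = np.meshgrid(axis, axis)
grid = np.column_stack([gx.ravel(), gy.ravel()])
density = gaussian_kde_2d(z, grid, bandwidth=0.5).reshape(gx.shape)

# A probability density should integrate to about 1 over the plane
step = axis[1] - axis[0]
total_mass = density.sum() * step ** 2
print(f"approximate integral of the density: {total_mass:.4f}")
```

A smaller bandwidth produces a bumpier surface that follows individual points, while a larger one smooths the points into a single broad ridge. The contour levels in a KDE jointplot are level sets of a surface like `density`.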

KDE jointplots are more informative for smaller datasets or when you want to examine the shape of each distribution. Hexbin plots excel with larger or denser datasets, where an ordinary scatterplot would become unreadable because of overplotting.



Section 3. Chapter 3
