Visualizing Numerical Features: Histograms, KDE, Boxplots | Univariate Analysis
Exploratory Data Analysis with Python

Visualizing Numerical Features: Histograms, KDE, Boxplots

Understanding how numerical features are distributed is a key part of exploratory data analysis in retail datasets. Three essential visualization tools for this purpose are histograms, kernel density estimation (KDE) plots, and boxplots. Each method provides a different perspective on the shape, spread, and characteristics of your data. In retail analysis, these plots help you uncover trends in product prices, sales amounts, and transaction values, making it easier to spot patterns, skewness, or potential outliers that could affect business decisions.

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Sample retail data
data = pd.DataFrame({
    "product_price": [9.99, 12.49, 7.99, 19.99, 14.99, 11.49,
                      16.99, 9.99, 21.99, 8.49, 15.49, 13.99]
})

plt.figure(figsize=(6, 4))
sns.histplot(data["product_price"], bins=6, kde=False, color="skyblue", edgecolor="black")
plt.title("Histogram of Product Prices")
plt.xlabel("Product Price ($)")
plt.ylabel("Frequency")
plt.tight_layout()
plt.show()
```
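To see the numbers behind the bars, the bin counts can be computed directly. The sketch below assumes NumPy's `np.histogram` with `bins=6`, which splits the observed price range into six equal-width intervals. Seaborn is expected to bin the same way for an integer `bins` value, but treat that as an assumption rather than a guarantee.

```python
import numpy as np

# Same sample prices as in the histogram example
prices = [9.99, 12.49, 7.99, 19.99, 14.99, 11.49,
          16.99, 9.99, 21.99, 8.49, 15.49, 13.99]

# Six equal-width bins spanning min(prices) to max(prices)
counts, edges = np.histogram(prices, bins=6)

# Print each bin's interval and how many prices fall into it
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"${left:6.2f} to ${right:6.2f}: {count} product(s)")
```

Each printed row corresponds to one bar: the interval is the bar's position on the x-axis and the count is its height.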
```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Fix the seed so the simulated data is the same on every run
np.random.seed(42)

# Simulated sales amounts for retail transactions (right-skewed gamma draws)
sales_data = pd.DataFrame({
    "sales_amount": np.random.gamma(shape=2.0, scale=20.0, size=100)
})

plt.figure(figsize=(6, 4))
sns.kdeplot(sales_data["sales_amount"], fill=True, color="orange")
plt.title("KDE Plot of Sales Amounts")
plt.xlabel("Sales Amount ($)")
plt.ylabel("Density")
plt.tight_layout()
plt.show()
```
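A KDE curve can be thought of as the average of small bell curves, one centred on each observation. The following is a minimal NumPy sketch of that idea, not seaborn's actual implementation. The helper name `gaussian_kde_sketch` is hypothetical, and the bandwidth uses Scott's rule of thumb, which is assumed here to be comparable to seaborn's default.

```python
import numpy as np

def gaussian_kde_sketch(samples, grid, bandwidth):
    """Average of Gaussian bumps centred on each sample (illustrative helper)."""
    samples = np.asarray(samples, dtype=float)
    # Standardised distance from every grid point to every sample
    z = (grid[:, None] - samples[None, :]) / bandwidth
    bumps = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2 * np.pi))
    return bumps.mean(axis=1)

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
sales = rng.gamma(shape=2.0, scale=20.0, size=100)

# Scott's rule of thumb for the bandwidth (an assumption, see the text above)
bandwidth = sales.std(ddof=1) * len(sales) ** (-1 / 5)

# Evaluate the density on a grid that extends past both ends of the data
grid = np.linspace(sales.min() - 4 * bandwidth, sales.max() + 4 * bandwidth, 500)
density = gaussian_kde_sketch(sales, grid, bandwidth)

# A density should integrate to roughly 1 over the grid
area = density.sum() * (grid[1] - grid[0])
print(f"bandwidth = {bandwidth:.2f}, area under curve = {area:.3f}")
```

A smaller bandwidth gives a bumpier curve that follows individual points, while a larger one smooths over detail. Note that the smoothed curve can extend below zero even though sales amounts cannot be negative.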
```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Simulated transaction amounts with two unusually high values (80 and 87)
transaction_data = pd.DataFrame({
    "transaction_amount": [50, 60, 55, 52, 58, 54, 53, 56, 87, 51, 57, 59, 80]
})

plt.figure(figsize=(4, 6))
sns.boxplot(y=transaction_data["transaction_amount"], color="lightgreen")
plt.title("Boxplot of Transaction Amounts")
plt.ylabel("Transaction Amount ($)")
plt.tight_layout()
plt.show()
```
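The points a boxplot draws as outliers are usually chosen with the 1.5 × IQR rule. The sketch below reproduces that rule with NumPy on the same transaction amounts. It assumes the common default whisker length of 1.5 and linear-interpolation quartiles; other settings would move the fences.

```python
import numpy as np

# Same transaction amounts as in the boxplot example
amounts = np.array([50, 60, 55, 52, 58, 54, 53, 56, 87, 51, 57, 59, 80])

# Quartiles and interquartile range
q1, median, q3 = np.percentile(amounts, [25, 50, 75])
iqr = q3 - q1

# Values beyond these fences are flagged as potential outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = amounts[(amounts < lower_fence) | (amounts > upper_fence)]

print(f"Q1={q1}, median={median}, Q3={q3}, IQR={iqr}")
print(f"fences: [{lower_fence}, {upper_fence}]")
print("outliers:", sorted(outliers.tolist()))
```

For this data the quartiles are 53, 56, and 59, so the fences sit at 44 and 68, and the two amounts 80 and 87 fall above the upper fence.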

Each visualization helps you explore numerical features, but they serve different purposes in retail data analysis:

  • Histograms: show how many observations fall into each range (bin) of values. You can quickly see where most product prices cluster and detect gaps or spikes;
  • KDE plots: provide a smoothed estimate of the distribution, making it easier to see the overall shape, secondary peaks, and tails in sales amounts;
  • Boxplots: summarize the data with the median and quartiles and mark potential outliers as individual points. This makes it easy to spot unusually high or low transaction amounts that may need further investigation.

Use these plots together to get a complete picture of your numerical data's distribution and identify key patterns or outliers.

Which type of plot is most effective for visually identifying outliers in a retail dataset?

Section 2. Chapter 1
