Learn Analyzing Numerical–Categorical Relationships | Bivariate and Correlation Analysis
Exploratory Data Analysis with Python

Analyzing Numerical–Categorical Relationships

To analyze how a numerical feature—like sales amount—varies across categories in your retail dataset, compare the distribution of that feature for each category.

This approach helps you answer questions such as:

  • Do some product types have higher average sales than others?
  • Is the spread of sales wider for some store locations?
  • Which product categories are most profitable?
  • Which customer segments tend to spend more?

By comparing numerical data across categories, you can uncover patterns that inform business decisions and highlight key differences between groups.
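Before plotting anything, a quick numeric comparison is often enough to answer the first two questions. The sketch below is one possible way to do it with pandas `groupby`; the data is a small made-up retail sample, and the choice of summary statistics is an assumption rather than a prescribed recipe.

```python
import pandas as pd

# Small illustrative retail sample (made-up values, not real sales data)
df = pd.DataFrame({
    "ProductCategory": ["Electronics", "Clothing", "Electronics", "Furniture", "Clothing",
                        "Furniture", "Electronics", "Clothing", "Furniture", "Electronics"],
    "SalesAmount": [200, 150, 300, 400, 160, 380, 150, 95, 420, 250],
})

# One row per category: group size, center (mean, median) and spread (std, min, max)
summary = (
    df.groupby("ProductCategory")["SalesAmount"]
      .agg(["count", "mean", "median", "std", "min", "max"])
      .sort_values("median", ascending=False)
)
print(summary)
```

Sorting by the median puts the categories with the highest typical sale first. With groups this small (three or four rows each), the standard deviation is only a rough indication of spread.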

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Example retail dataset
data = {
    "ProductCategory": ["Electronics", "Clothing", "Electronics", "Furniture", "Clothing", "Furniture", "Electronics", "Clothing", "Furniture", "Electronics"],
    "SalesAmount": [200, 150, 300, 400, 160, 380, 150, 95, 420, 250]
}
df = pd.DataFrame(data)

# Create a boxplot to compare sales amounts across product categories
plt.figure(figsize=(8, 5))
sns.boxplot(x="ProductCategory", y="SalesAmount", data=df)
plt.title("Sales Amount Distribution by Product Category")
plt.xlabel("Product Category")
plt.ylabel("Sales Amount")
plt.show()

When you interpret boxplots that compare sales amounts across product categories:

  • The median line inside each box shows the typical sales value for that category;
  • The spread shows how variable sales amounts are within that category: the height of the box is the interquartile range (the middle 50% of values), and the "whiskers" extend to the most extreme values that are not flagged as outliers;
  • Outliers are shown as individual points outside the whiskers (by default, points more than 1.5 times the interquartile range beyond the box); these may indicate unusually high or low sales for certain products;
  • Differences in medians reveal which categories tend to sell more or less;
  • Differences in spread and outliers can highlight categories with inconsistent sales or rare but significant transactions.
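The quantities a boxplot draws can also be computed directly, which helps when checking a reading of the chart. The following is a minimal sketch, assuming the common 1.5 × IQR rule for the whisker limits (the default in seaborn and matplotlib) and pandas' default linear interpolation for quartiles; `box_stats` is a hypothetical helper name, and the data is a small made-up retail sample.

```python
import pandas as pd

# Small illustrative retail sample (made-up values, not real sales data)
df = pd.DataFrame({
    "ProductCategory": ["Electronics", "Clothing", "Electronics", "Furniture", "Clothing",
                        "Furniture", "Electronics", "Clothing", "Furniture", "Electronics"],
    "SalesAmount": [200, 150, 300, 400, 160, 380, 150, 95, 420, 250],
})

def box_stats(values: pd.Series, whis: float = 1.5) -> pd.Series:
    """Hypothetical helper: the numbers behind one box of a boxplot."""
    q1, median, q3 = values.quantile([0.25, 0.5, 0.75])
    iqr = q3 - q1                       # height of the box
    lower_fence = q1 - whis * iqr       # points below this are drawn as outliers
    upper_fence = q3 + whis * iqr       # points above this are drawn as outliers
    outliers = values[(values < lower_fence) | (values > upper_fence)]
    return pd.Series({
        "q1": q1, "median": median, "q3": q3, "iqr": iqr,
        "lower_fence": lower_fence, "upper_fence": upper_fence,
        "n_outliers": len(outliers),
    })

# One row of box statistics per category
stats = pd.DataFrame({
    category: box_stats(group)
    for category, group in df.groupby("ProductCategory")["SalesAmount"]
}).T
print(stats)
```

In this sample no value falls outside the fences, so the plot shows no outlier points; the categories differ mainly in their medians and in the height of their boxes.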

Which of the following statements correctly describe how to interpret boxplots comparing sales amounts across product categories?

Section 3. Chapter 2
