Learn Box Plot | Section
Data Visualization & EDA
Section 1. Chapter 16
Box Plot


Note
Definition

A box plot is another extremely common plot in statistics. It uses quartiles to visualize the central tendency, spread, and potential outliers of the data.

Quartiles

[Figure: quartiles]

Quartiles split sorted data into four equal parts:

  • Q1: the median of the lower half of the data (about 25% of values lie below it);
  • Q2: the median of the whole dataset (50% of values lie below it);
  • Q3: the median of the upper half of the data (about 75% of values lie below it).
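
The quartile definitions above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the sample values are made up for illustration and are not from the course dataset. Note that np.percentile interpolates linearly by default, so other quartile conventions can give slightly different numbers.

```python
import numpy as np

# Hypothetical sample, chosen only for illustration (already sorted for readability)
data = np.array([2, 4, 4, 5, 7, 8, 9, 11, 12, 15, 30])

# The 25th, 50th and 75th percentiles correspond to Q1, Q2 (the median) and Q3
q1, q2, q3 = np.percentile(data, [25, 50, 75])

print(f"Q1 = {q1}, Q2 = {q2}, Q3 = {q3}")
```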

Box Plot Elements

[Figure: box plot elements explained]
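  • In a horizontal box plot, the left edge of the box shows Q1 and the right edge shows Q3 (in matplotlib's default vertical orientation these are the bottom and top edges);
  • IQR = Q3 − Q1 is the interquartile range, shown as the width of the box, with the median marked by a line inside the box (drawn in yellow in the illustration);
  • Whiskers extend to the most extreme data points that still lie within Q1 − 1.5 · IQR and Q3 + 1.5 · IQR;
  • Points beyond the whiskers are treated as outliers and drawn individually.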
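
The elements listed above can be computed by hand to see where the box, whiskers, and outliers come from. This is a sketch under the assumption that the default 1.5 · IQR rule is used; the sample values and variable names are hypothetical.

```python
import numpy as np

# Hypothetical sample with one unusually large value
data = np.array([2, 4, 4, 5, 7, 8, 9, 11, 12, 15, 30])

q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1  # width of the box

# Fences: the limits that the whiskers may not exceed
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers end at the most extreme data points inside the fences
whisker_low = data[data >= lower_fence].min()
whisker_high = data[data <= upper_fence].max()

# Everything outside the fences is reported as an outlier
outliers = data[(data < lower_fence) | (data > upper_fence)]

print("IQR:", iqr)
print("Fences:", lower_fence, upper_fence)
print("Whiskers:", whisker_low, whisker_high)
print("Outliers:", outliers)
```

In this sample the upper whisker stops at 15 rather than at the fence itself, because no data point lies exactly on the fence.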

A box plot can be generated using matplotlib.

    import pandas as pd
    import matplotlib.pyplot as plt

    # Loading the dataset with the average yearly temperatures in Boston and Seattle
    url = 'https://content-media-cdn.codefinity.com/courses/47339f29-4722-4e72-a0d4-6112c70ff738/weather_data.csv'
    weather_df = pd.read_csv(url, index_col=0)

    # Creating a box plot for the Seattle temperatures
    plt.boxplot(weather_df['Seattle'])

    plt.show()

Box Plot Data

Use plt.boxplot(x), where x can be a 1D array-like object, a 2D array (one box per column), or a sequence of 1D arrays.
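
To make the three accepted input shapes concrete, here is a small sketch that draws each of them on its own axes and counts the resulting boxes. The random data, the seed, and the variable names are assumptions for illustration; ax.boxplot is the Axes-level equivalent of plt.boxplot. The Agg backend is selected only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering; remove this line to use an interactive backend
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, assumed here for reproducibility

one_d = rng.normal(size=100)                         # 1D array: a single box
two_d = rng.normal(size=(100, 3))                    # 2D array: one box per column
ragged = [rng.normal(size=50), rng.normal(size=80)]  # sequence of 1D arrays: one box each

fig, axes = plt.subplots(1, 3, figsize=(9, 3))
results = [ax.boxplot(x) for ax, x in zip(axes, (one_d, two_d, ragged))]

# boxplot() returns a dict of artists; there is one median line per box
box_counts = [len(result["medians"]) for result in results]
print(box_counts)
```

The sequence form is the one to reach for when the samples have different lengths, since a 2D array requires equally sized columns.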

Optional Parameters

tick_labels sets the label shown under each box, which is especially useful when plotting multiple arrays. In Matplotlib versions before 3.9 this parameter was called labels.

    import pandas as pd
    import matplotlib.pyplot as plt

    # Loading the dataset with the average yearly temperatures in Boston and Seattle
    url = 'https://content-media-cdn.codefinity.com/courses/47339f29-4722-4e72-a0d4-6112c70ff738/weather_data.csv'
    weather_df = pd.read_csv(url, index_col=0)

    # Creating two box plots for Boston and Seattle temperatures
    plt.boxplot(weather_df, tick_labels=['Boston', 'Seattle'])
    plt.show()

Passing a DataFrame with two numeric columns to boxplot() creates two separate box plots, one per column. The names under the boxes come from tick_labels; without it, the boxes are simply numbered 1 and 2.

Note
Study More

There are also quite a few optional parameters for customizing the box plot, which you can explore in the boxplot() documentation, although in practice you will rarely need most of them.
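
As a taste of those optional parameters, the sketch below changes the whisker rule, marks the mean, hides the outlier points, and fills the boxes. The data, seed, and color are assumptions for illustration; whis, showmeans, showfliers, and patch_artist are existing boxplot() parameters.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering; remove this line to use an interactive backend
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, assumed here for reproducibility
samples = [rng.normal(0, 1, 200), rng.normal(1, 2, 200)]

fig, ax = plt.subplots()
result = ax.boxplot(
    samples,
    whis=2.0,           # fences at 2.0 * IQR instead of the default 1.5
    showmeans=True,     # add a marker for the mean of each sample
    showfliers=False,   # do not draw the outlier points
    patch_artist=True,  # draw boxes as filled patches so they can be colored
)

# With patch_artist=True each box is a patch whose fill color can be set
for box in result["boxes"]:
    box.set_facecolor("lightblue")
```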

Task


Create two box plots using two samples from the standard normal distribution:

  1. Use the correct function to create the box plots.
  2. Use the list of normal_sample_1 and normal_sample_2 (in this order from left to right) as the data.
  3. Label the left box plot as 'First sample' and the right one as 'Second sample' by passing a list of labels.
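
One possible way to approach the task is sketched below. It is not the course's reference solution: the sample size, the seed, and the way the samples are generated are assumptions, and only the names normal_sample_1 and normal_sample_2 come from the task text. The try/except is there because Matplotlib older than 3.9 expects labels instead of tick_labels.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering; remove this line and call plt.show() to see a window
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)  # seed and sample size are assumptions
normal_sample_1 = rng.standard_normal(100)
normal_sample_2 = rng.standard_normal(100)

# The order of the list determines the left-to-right order of the boxes
data = [normal_sample_1, normal_sample_2]
names = ["First sample", "Second sample"]

try:
    plt.boxplot(data, tick_labels=names)
except TypeError:
    # Fallback for Matplotlib versions before 3.9, where the parameter is `labels`
    plt.boxplot(data, labels=names)
```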
