Data Visualization with Matplotlib
Analyzing Data with Box Plots

Note
Definition

A box plot is another very common statistical plot. It uses the quartiles of the data to visualize its central tendency, spread, and potential outliers.

Quartiles

Quartiles split sorted data into four equal parts:

  • Q1 (first quartile): the value below which 25% of the data falls, i.e. the median of the lower half;
  • Q2 (second quartile): the median, with 50% of the data below it;
  • Q3 (third quartile): the value below which 75% of the data falls, i.e. the median of the upper half.
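
As a quick check of these definitions, here is a minimal sketch that computes the three quartiles with NumPy. The sample values are made up for illustration. Note that np.percentile interpolates linearly by default, so other quartile conventions can give slightly different results on small samples.

```python
import numpy as np

# Hypothetical sample, chosen only for illustration
data = np.array([7, 1, 9, 3, 5, 2, 8, 4, 6])

# np.percentile sorts internally; the 25th, 50th and 75th
# percentiles correspond to Q1, Q2 (the median) and Q3
q1, q2, q3 = np.percentile(data, [25, 50, 75])

print(q1, q2, q3)  # 3.0 5.0 7.0
```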

Box Plot Elements

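  • In a horizontal box plot, the left side of the box shows Q1 and the right side shows Q3 (in matplotlib's default vertical orientation, these are the bottom and top edges);
  • IQR = Q3 − Q1 is the length of the box, and the median is marked by a line inside the box (orange by default in matplotlib);
  • Whiskers extend to the most extreme data points that still lie within Q1 − 1.5 · IQR and Q3 + 1.5 · IQR;
  • Points outside the whiskers are drawn individually as outliers.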
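
The 1.5 · IQR rule can be traced by hand. The following is a minimal sketch with NumPy on a made-up sample; the variable names are illustrative, and the quartiles use NumPy's default linear interpolation. Each whisker ends at the most extreme data point inside the fences, and anything beyond a fence counts as an outlier.

```python
import numpy as np

# Hypothetical sample with one obviously extreme value
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])

q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1

# Fences: the furthest a whisker is allowed to reach
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers stop at the most extreme data points inside the fences
inside = data[(data >= lower_fence) & (data <= upper_fence)]
whisker_low, whisker_high = inside.min(), inside.max()

# Everything beyond the fences is treated as an outlier
outliers = data[(data < lower_fence) | (data > upper_fence)]

print(iqr)                        # 4.5
print(whisker_low, whisker_high)  # 1 9
print(outliers)                   # [30]
```

The factor 1.5 corresponds to the default value of the whis parameter of boxplot().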

A box plot can be generated using matplotlib.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Loading the dataset with the average yearly temperatures in Boston and Seattle
url = 'https://content-media-cdn.codefinity.com/courses/47339f29-4722-4e72-a0d4-6112c70ff738/weather_data.csv'
weather_df = pd.read_csv(url, index_col=0)

# Creating a box plot for the Seattle temperatures
plt.boxplot(weather_df['Seattle'])
plt.show()
```

Box Plot Data

Use plt.boxplot(x), where x can be a 1D array-like object, a 2D array (one box per column), or a sequence of 1D arrays.
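
To make the three input forms concrete, here is a small sketch on randomly generated data (the samples and variable names are assumptions, not part of the lesson's dataset). It uses the Axes method ax.boxplot(), which accepts the same data as plt.boxplot(), and counts the boxes drawn in each case through the dictionary that boxplot() returns.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)  # seeded so the sketch is reproducible

fig, axes = plt.subplots(1, 3, figsize=(12, 4))

# 1D array-like: a single box
one_d = rng.normal(size=100)
single = axes[0].boxplot(one_d)

# 2D array: one box per column (3 columns here)
two_d = rng.normal(size=(100, 3))
per_column = axes[1].boxplot(two_d)

# Sequence of 1D arrays: one box per array, lengths may differ
ragged = [rng.normal(size=50), rng.normal(size=80)]
per_array = axes[2].boxplot(ragged)

# boxplot() returns a dict of the artists it created
print(len(single['boxes']), len(per_column['boxes']), len(per_array['boxes']))  # 1 3 2

# plt.show()  # uncomment to display the figure
```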

Optional Parameters

tick_labels sets the name shown under each box, which is especially useful when plotting multiple arrays. This parameter was called labels before Matplotlib 3.9.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Loading the dataset with the average yearly temperatures in Boston and Seattle
url = 'https://content-media-cdn.codefinity.com/courses/47339f29-4722-4e72-a0d4-6112c70ff738/weather_data.csv'
weather_df = pd.read_csv(url, index_col=0)

# Creating two box plots for Boston and Seattle temperatures
plt.boxplot(weather_df, tick_labels=['Boston', 'Seattle'])
plt.show()
```

Passing a DataFrame with two numeric columns to boxplot() creates two separate box plots, one per column. The names on the axis come from tick_labels; without it, plt.boxplot() labels the boxes by position (1, 2, ...).
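
If you would rather not type the names by hand, the column names of the DataFrame can be reused as labels. The sketch below is one possible way to do this, not the course's own method: it uses a small made-up DataFrame in place of weather_df and sets the labels with Axes.set_xticklabels(), which does not depend on the tick_labels parameter.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for weather_df; the values are invented
df = pd.DataFrame({
    'Boston': [51.2, 52.0, 50.8, 53.1, 52.4],
    'Seattle': [52.5, 53.0, 51.9, 54.2, 53.3],
})

fig, ax = plt.subplots()

# One box per column of the underlying 2D array
result = ax.boxplot(df.to_numpy())

# Reuse the column names as the labels under the boxes
ax.set_xticklabels(list(df.columns))

labels = [tick.get_text() for tick in ax.get_xticklabels()]
print(labels)

# plt.show()  # uncomment to display the figure
```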

Note
Study More

There are also quite a few optional parameters for customizing a box plot, which you can explore in the boxplot() documentation, although in practice you will rarely need them.

Task

Create two box plots using two samples from the standard normal distribution:

  1. Use the correct function to create the box plots.
  2. Use the list of normal_sample_1 and normal_sample_2 (in this order from left to right) as the data.
  3. Label the left box plot as First sample and the right one as Second sample by passing a list of labels.
