Joint Plot | Plotting with Seaborn
Ultimate Visualization with Python
Joint Plot

A joint plot is a rather unique plot, since it combines several plots in one figure. By default, it has three elements:

  • a histogram on the top, which shows the distribution of one variable;
  • a histogram on the right, which shows the distribution of another variable;
  • a scatter plot in the middle, which shows the relationship between these two variables.

Here is an example of a joint plot:

Data for the Joint Plot

seaborn has a jointplot() function which, like countplot() and kdeplot(), has three main parameters:

  • data;
  • x;
  • y.

The x and y parameters are the variables we are interested in (shown in the top and right histograms, respectively). They can be either array-like objects or the names of columns in a DataFrame (if the data parameter is also set to that DataFrame).

Let’s have a look at an example:

    import seaborn as sns
    import matplotlib.pyplot as plt
    # Load the dataset with measurements of three iris flower species
    iris_df = sns.load_dataset("iris")
    sns.jointplot(data=iris_df, x="sepal_length", y="sepal_width")
    plt.show()

We have just recreated the example from the beginning by passing a DataFrame object as the data parameter and the names of its columns as x and y.

Plot in the Middle

Another useful parameter is kind, which specifies the plot in the middle. Its default value is 'scatter'. The other possible values are 'kde', 'hist', 'hex', 'reg', and 'resid'. Feel free to experiment with different plots:

    import seaborn as sns
    import matplotlib.pyplot as plt
    # Load the dataset with measurements of three iris flower species
    iris_df = sns.load_dataset("iris")
    sns.jointplot(data=iris_df, x="sepal_length", y="sepal_width", kind='reg')
    plt.show()

Plot Kinds

Although a scatter plot is the most common choice for the plot in the middle, here is what the other kinds do:

  • 'reg' draws a scatter plot together with a fitted linear regression line, which is useful for checking whether two variables are linearly related;
  • 'resid' plots the residuals of a linear regression (see the documentation);
  • 'hist' creates a bivariate histogram (a histogram of two variables);
  • 'kde' creates a KDE (kernel density estimate) plot;
  • 'hex' creates a hexbin plot. It's a scatter plot where hexagonal bins are used instead of individual data points, and the color of each bin indicates how many data points fall within it.

As usual, feel free to explore more parameters in the documentation.

Task

  1. Use the correct function to create a joint plot.
  2. Use weather_df as the data for the plot (the first argument).
  3. Set the 'Boston' column for the x-axis variable (the second argument).
  4. Set the 'Seattle' column for the y-axis variable (the third argument).
  5. Set the plot in the middle to have a regression line (the rightmost argument).

Section 5. Chapter 5