Learn KDE Plot | Plotting with Seaborn
Ultimate Visualization with Python

KDE Plot

A kernel density estimation (KDE) plot visualizes an estimate of the probability density function of the data. It resembles a histogram, which we discussed in the previous section, but a KDE plot is a continuous curve rather than a set of bars, and it is computed from every individual data point rather than from counts within intervals. Let’s have a look at an example of a KDE plot:
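To make the idea of a curve built from every data point concrete, here is a minimal sketch of a Gaussian KDE written with NumPy only. It is an illustration, not seaborn's implementation: the helper name gaussian_kde_1d, the synthetic samples and the hand-picked bandwidth of 0.8 are all assumptions (seaborn chooses a bandwidth automatically).

```python
import numpy as np


def gaussian_kde_1d(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at each point of `grid`."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Scaled distance from every grid point to every sample
    z = (grid[:, None] - samples[None, :]) / bandwidth
    # One Gaussian bump per (grid point, sample) pair
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    # Average the bumps over all samples and rescale by the bandwidth
    return kernels.mean(axis=1) / bandwidth


rng = np.random.default_rng(0)
# Synthetic values (an assumption, not the course dataset)
samples = rng.normal(loc=52, scale=2, size=200)
grid = np.linspace(samples.min() - 10, samples.max() + 10, 1000)
density = gaussian_kde_1d(samples, grid, bandwidth=0.8)

# A density should integrate to roughly 1 over its support
area = float(np.sum(density) * (grid[1] - grid[0]))
print(round(area, 3))
```

Every sample contributes one small bell-shaped bump, and the curve is their average. A larger bandwidth gives a smoother curve, a smaller one a bumpier curve.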

As you can see, here we have a histogram combined with a KDE plot (orange curve). This combination gives a much clearer approximation of the probability density function than a histogram alone.

With seaborn, creating a KDE plot is straightforward thanks to the dedicated kdeplot() function. Its most important parameters, data, x and y, work the same way as in the countplot() function.

First Option

We can set just one of these parameters by passing a sequence of values. Here is an example:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
url = 'https://content-media-cdn.codefinity.com/courses/47339f29-4722-4e72-a0d4-6112c70ff738/weather_data.csv'
# Loading the dataset with the average yearly temperatures in Boston and Seattle
weather_df = pd.read_csv(url, index_col=0)
# Creating a KDE plot setting only the data parameter
sns.kdeplot(data=weather_df['Seattle'], fill=True)
plt.show()

Here we set only the data parameter, passing a Series object, and use the fill parameter to fill the area under the curve (it is not filled by default).

Second Option

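It is also possible to pass a 2D object such as a DataFrame as data, together with a column name (or a key, if data is a dictionary) as x or y. With x, the values run along the horizontal axis and the density curve rises vertically (vertical orientation); with y, the axes are swapped (horizontal orientation):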

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
url = 'https://content-media-cdn.codefinity.com/courses/47339f29-4722-4e72-a0d4-6112c70ff738/weather_data.csv'
weather_df = pd.read_csv(url, index_col=0)
# Creating a KDE plot setting both the data and x parameters
sns.kdeplot(data=weather_df, x='Seattle', fill=True)
plt.show()

We get the same result by passing the whole DataFrame as the data parameter and the column name as the x parameter.

By the way, the KDE plot we created has a characteristic bell shape and closely resembles a normal distribution with a mean of approximately 52°F.

To learn more about the kdeplot() function, feel free to refer to its documentation.

Task

  1. Use the correct function to create a KDE plot.
  2. Use countries_df as the data for the plot (the first argument).
  3. Use the second argument to select the 'GDP per capita' column and set the orientation to horizontal.
  4. Fill in the area under the curve via the third (rightmost) argument.

Solution

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
url = 'https://content-media-cdn.codefinity.com/courses/47339f29-4722-4e72-a0d4-6112c70ff738/countries_data.csv'
# Loading the dataset with the countries data
countries_df = pd.read_csv(url, index_col=0)
# Create a KDE plot
sns.kdeplot(data=countries_df, y='GDP per capita', fill=True)
plt.show()

Section 5. Chapter 4
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
url = 'https://content-media-cdn.codefinity.com/courses/47339f29-4722-4e72-a0d4-6112c70ff738/countries_data.csv'
# Loading the dataset with the countries data
countries_df = pd.read_csv(url, index_col=0)
# Create a KDE plot
___.___(___=___, ___='___', ___=___)
plt.show()
