KDE Plot
A kernel density estimation (KDE) plot visualizes an estimate of a variable's probability density function. It resembles the histogram discussed in the previous section, but a KDE plot is a continuous curve rather than a set of bars, and it is computed from all of the data points rather than from counts within intervals. Let’s have a look at an example of a KDE plot:
As you can see, here we have a histogram combined with a KDE plot (orange curve). This combination approximates the probability density function more clearly than the histogram alone.
With seaborn, creating a KDE plot is straightforward, since there is a dedicated kdeplot() function. Its most important parameters, data, x, and y, work the same way as in the countplot() function.
First Option
We can set just one of these parameters by passing a sequence of values. Here is an example:
We set only the data parameter, passing a Series object, and use the fill parameter to fill in the area under the curve (it is not filled in by default).
Second Option
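It is also possible to pass a 2D object such as a DataFrame for data and a column name (or a key, if data is a dictionary) for x (vertical orientation: the values run along the x-axis and the density rises vertically) or y (horizontal orientation: the values run along the y-axis):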
We achieved the same result by passing the whole DataFrame as the data parameter and the column name as the x parameter.
Note that the KDE plot we created has the characteristic bell shape and closely resembles a normal distribution with a mean of approximately 52°F.
To explore the kdeplot() function further, refer to its documentation.
Task
- Use the correct function to create a KDE plot.
- Use countries_df as the data for the plot (the first argument).
- Set 'GDP per capita' as the column to use and the orientation to horizontal via the second argument.
- Fill in the area under the curve via the third (rightmost) argument.