Visualization: First Steps
Visualization is an essential tool for data analysts. The first plot type we will use is .barplot(). Before using it, import the libraries:
import matplotlib.pyplot as plt
import seaborn as sns
We will use the second one, Seaborn, but it is built on Matplotlib, so we need to import both. Look at the dataset that we have been using for examples:
Our task is to visualize experience_level and the median salary for each level. Look at the code:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/INTRO+to+Python/ds_salaries.csv', index_col = 0)
df = df[['experience_level', 'salary']].groupby(['experience_level']).median().reset_index()
sns.barplot(data = df, x = 'experience_level', y = 'salary')
plt.show()
The output is a bar plot with one bar per experience level, where the height of each bar is the median salary.
Look at the sixth line of code:
df = df[['experience_level', 'salary']].groupby(['experience_level']).median().reset_index()
Here you can see a new method, .reset_index(). After .groupby(), the experience_level values become the index of the result; .reset_index() turns them back into an ordinary column, so the result is a regular DataFrame again. Compare the pictures (the first one is before and the second one is after):
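The same before-and-after comparison can be sketched on a small table. The data below is made up for illustration and is not taken from ds_salaries.csv; only the column names match the lesson.

```python
import pandas as pd

# Made-up data for illustration only; not taken from ds_salaries.csv
df = pd.DataFrame({
    'experience_level': ['EN', 'EN', 'SE', 'SE', 'SE'],
    'salary': [40000, 60000, 100000, 120000, 200000],
})

# Before: the group labels form the index, and 'salary' is the only column
grouped = df[['experience_level', 'salary']].groupby(['experience_level']).median()
print(grouped.columns.tolist())  # ['salary']

# After: 'experience_level' is an ordinary column again, with a default integer index
flat = grouped.reset_index()
print(flat.columns.tolist())  # ['experience_level', 'salary']
```

Having experience_level as a column matters for the next step, because the plotting call refers to it by column name.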
Now move to the seventh line of code:
sns.barplot(data = df, x = 'experience_level', y = 'salary')
- sns: refers to the seaborn library.
- .barplot: the type of plot.
- data = df: the DataFrame.
- x = 'experience_level': the column for the x-axis.
- y = 'salary': the column for the y-axis.
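To see how these arguments map onto the figure, here is a minimal sketch with a made-up, already aggregated DataFrame. The values and the Agg backend line are assumptions added so the example runs without a display. sns.barplot() returns the Matplotlib Axes it drew on, which gives one way to inspect the result.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

# Made-up, already aggregated data: one row per experience level
df = pd.DataFrame({
    'experience_level': ['EN', 'MI', 'SE'],
    'salary': [50000, 80000, 120000],
})

# data is the DataFrame; x and y are column names within it
ax = sns.barplot(data=df, x='experience_level', y='salary')

# By default the axis labels are the column names passed as x and y
print(ax.get_xlabel(), ax.get_ylabel())

# Each bar is a rectangle whose height is the y value for that category
heights = [p.get_height() for p in ax.patches]
print(heights)
```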
Move to the eighth line of code:
plt.show()
This function from the matplotlib library displays the plot.
Task
Visualize the sum of money you receive from users depending on their subscription plan.
- Import seaborn with the sns alias.
- Import matplotlib.pyplot with the plt alias.
- Prepare data for visualization using the .groupby() function:
  - Extract the columns 'plan' and 'price'.
  - Group by the column 'plan'.
  - Calculate the sum of all prices for each plan.
  - Reset the index.
- Create the barplot using seaborn:
  - Use df as the data argument.
  - Use the 'plan' column for the x-axis.
  - Use the 'price' column for the y-axis.
- Output the plot.
Solution
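The dataset for this task is not shown here, so the following is only a sketch under assumptions: the DataFrame is a made-up stand-in, and only the column names 'plan' and 'price' come from the task description. The Agg backend line is also an assumption, added so the code runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend; remove to open a plot window
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

# Hypothetical stand-in for the task's dataset
df = pd.DataFrame({
    'plan': ['basic', 'premium', 'basic', 'premium', 'basic'],
    'price': [10, 30, 10, 30, 10],
})

# Extract the columns, group by plan, sum the prices, reset the index
df = df[['plan', 'price']].groupby('plan').sum().reset_index()

# Bar plot of the total price per plan
sns.barplot(data=df, x='plan', y='price')
plt.show()
```

With the real dataset, the hypothetical DataFrame would be replaced by the table loaded in the task environment; the grouping and plotting lines stay the same.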