Introduction to Python for Data Analysis
Visualization: First Steps
Visualization is an essential tool for data analysts. The first plotting function we will look at is .barplot(). To use it, you first need to import the libraries. Look at the syntax:
import matplotlib.pyplot as plt
import seaborn as sns
We will use the second one, Seaborn, but it is built on top of Matplotlib, so we need to import both of them. Look at the dataset that we have been using for examples:
Our task is to visualize each experience_level together with the median salary for that level. Look at the code:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/INTRO+to+Python/ds_salaries.csv', index_col = 0)
df = df[['experience_level', 'salary']].groupby(['experience_level']).median().reset_index()
sns.barplot(data = df, x = 'experience_level', y = 'salary')
plt.show()
Here is the output:
Look at the sixth line of code:
df = df[['experience_level', 'salary']].groupby(['experience_level']).median().reset_index()
Here you can see a new function, .reset_index(). It simply transforms the result of the .groupby() function back into a regular DataFrame: the group labels move out of the index and become an ordinary column again. Look at the pictures (the first one is before and the second one is after):
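The same before-and-after effect can be reproduced on a small table. The sketch below uses a hypothetical toy DataFrame (the name toy and its values are made up for illustration and are not taken from ds_salaries.csv) and prints the grouped result with and without .reset_index():

```python
import pandas as pd

# Hypothetical toy data, not taken from ds_salaries.csv
toy = pd.DataFrame({
    "experience_level": ["EN", "EN", "SE", "SE", "SE"],
    "salary": [40, 60, 100, 120, 200],
})

# Before: the group labels form the index of the result
before = toy.groupby("experience_level").median()

# After: the labels are an ordinary column and the index is 0, 1, ...
after = before.reset_index()

print(before)
print(after)
```

Having experience_level as a regular column is what lets us pass its name to the x argument of the plot later on.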
Next, move to the seventh line of code:
sns.barplot(data = df, x = 'experience_level', y = 'salary')
- sns: refers to the seaborn library.
- barplot: the type of plot.
- data = df: the DataFrame.
- x = 'experience_level': the column for the x-axis.
- y = 'salary': the column for the y-axis.
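To see how these arguments map onto the plot, here is a minimal sketch on a hypothetical, already aggregated DataFrame (the name toy and its values are assumptions for illustration only). It relies on the fact that sns.barplot() returns the Matplotlib Axes object it drew on, so the axis labels can be inspected directly:

```python
import pandas as pd
import seaborn as sns

# Hypothetical, already aggregated data: one row per experience level
toy = pd.DataFrame({
    "experience_level": ["EN", "MI", "SE"],
    "salary": [50, 80, 120],
})

# barplot returns the matplotlib Axes it drew on
ax = sns.barplot(data=toy, x="experience_level", y="salary")

# The column names passed as x and y are used as the axis labels
print(ax.get_xlabel(), ax.get_ylabel())
```

Calling plt.show() afterwards, as in the lesson code, displays the figure.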
Move to the eighth line of code:
plt.show()
A function from the matplotlib library that outputs the plot.
Task
Visualize the sum of money you receive from users depending on their subscription plan.
- Import seaborn with the sns alias.
- Import matplotlib.pyplot with the plt alias.
- Prepare the data for visualization using the .groupby() function:
  - Extract the columns 'plan', 'price'.
  - Group by the plan column.
  - Calculate the sum of all prices for each plan.
  - Reset the index.
- Create the barplot using seaborn:
  - Use df as the data argument.
  - Use the 'plan' column for the x-axis.
  - Use the 'price' column for the y-axis.
- Output the plot.