Pandas First Steps
Unique Values
Values are often repeated within a DataFrame column. For instance, in our countries DataFrame, the continent column has repeated entries. The unique() method returns an array of the distinct values in a specific DataFrame column. Let's revisit this DataFrame.
import pandas as pd

dataset = {
    'country': ['Thailand', 'Philippines', 'Monaco', 'Malta', 'Sweden', 'Paraguay', 'Latvia'],
    'continent': ['Asia', 'Asia', 'Europe', 'Europe', 'Europe', 'South America', 'Europe'],
    'capital': ['Bangkok', 'Manila', 'Monaco', 'Valletta', 'Stockholm', 'Asuncion', 'Riga']
}
countries = pd.DataFrame(dataset)
print(countries)
Now, let's apply the unique() method to the 'continent' and 'country' columns.
import pandas as pd

dataset = {
    'country': ['Thailand', 'Philippines', 'Monaco', 'Malta', 'Sweden', 'Paraguay', 'Latvia'],
    'continent': ['Asia', 'Asia', 'Europe', 'Europe', 'Europe', 'South America', 'Europe'],
    'capital': ['Bangkok', 'Manila', 'Monaco', 'Valletta', 'Stockholm', 'Asuncion', 'Riga']
}
countries = pd.DataFrame(dataset)

# Collect the distinct values of each column
unique_countries = countries['country'].unique()
unique_continents = countries['continent'].unique()

print(unique_countries)
print(unique_continents)
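Two details of unique() are easy to miss, so here is a minimal sketch that uses the same countries data (the capital column is left out for brevity). It illustrates that the distinct values come back in order of first appearance rather than sorted, and that the related nunique() method gives only their count. The variable name n_continents is introduced here for illustration.

```python
import pandas as pd

countries = pd.DataFrame({
    'country': ['Thailand', 'Philippines', 'Monaco', 'Malta', 'Sweden', 'Paraguay', 'Latvia'],
    'continent': ['Asia', 'Asia', 'Europe', 'Europe', 'Europe', 'South America', 'Europe'],
})

unique_continents = countries['continent'].unique()

# Distinct values appear in order of first occurrence, not alphabetically
print(list(unique_continents))  # ['Asia', 'Europe', 'South America']

# nunique() returns how many distinct values the column holds
n_continents = countries['continent'].nunique()
print(n_continents)  # 3
```

If a sorted result is needed, the array can be passed to Python's built-in sorted().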
Task
Given the audi_cars DataFrame, identify all distinct values in the 'year' and 'fueltype' columns.
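One possible approach is sketched below. The real audi_cars dataset is not shown in this lesson, so the rows here are a hypothetical stand-in invented purely for illustration. Only the column names 'year' and 'fueltype' come from the task; the 'model' column and all values are assumptions.

```python
import pandas as pd

# Hypothetical stand-in for audi_cars; the real dataset is not shown in this lesson
audi_cars = pd.DataFrame({
    'model': ['A1', 'A3', 'A4', 'Q3'],
    'year': [2017, 2016, 2017, 2019],
    'fueltype': ['Petrol', 'Diesel', 'Petrol', 'Petrol'],
})

# Apply unique() to each column of interest
unique_years = audi_cars['year'].unique()
unique_fueltypes = audi_cars['fueltype'].unique()

print(unique_years)
print(unique_fueltypes)
```

With the actual audi_cars DataFrame already loaded, only the two unique() calls and the print statements would be needed.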