Unique Values
Data often gets duplicated in DataFrames. For instance, in the countries
DataFrame, the 'continent'
column has repeated entries. There's a method that retrieves an array of distinct values from a specific DataFrame column.
1234567import pandas as pd country_data = {'country' : ['Thailand', 'Philippines', 'Monaco', 'Malta', 'Sweden', 'Paraguay', 'Latvia'], 'continent' : ['Asia', 'Asia', 'Europe', 'Europe', 'Europe', 'South America', 'Europe'], 'capital':['Bangkok', 'Manila', 'Monaco', 'Valletta', 'Stockholm', 'Asuncion', 'Riga']} countries = pd.DataFrame(country_data) print(countries)
Now, we'll apply the unique()
method to the 'continent'
and 'country'
columns:
12345678910import pandas as pd country_data = {'country' : ['Thailand', 'Philippines', 'Monaco', 'Malta', 'Sweden', 'Paraguay', 'Latvia'], 'continent' : ['Asia', 'Asia', 'Europe', 'Europe', 'Europe', 'South America', 'Europe'], 'capital':['Bangkok', 'Manila', 'Monaco', 'Valletta', 'Stockholm', 'Asuncion', 'Riga']} countries = pd.DataFrame(country_data) unique_countries = countries['country'].unique() unique_continents = countries['continent'].unique() print(unique_countries) print(unique_continents)
To count the number of distinct values in a specific column, you can use the nunique()
method:
1234567import pandas as pd country_data = {'country' : ['Thailand', 'Philippines', 'Monaco', 'Malta', 'Sweden', 'Paraguay', 'Latvia'], 'continent' : ['Asia', 'Asia', 'Europe', 'Europe', 'Europe', 'South America', 'Europe'], 'capital':['Bangkok', 'Manila', 'Monaco', 'Valletta', 'Stockholm', 'Asuncion', 'Riga']} countries = pd.DataFrame(country_data) print(countries['continent'].nunique())
Swipe to start coding
You are given a DataFrame
named audi_cars
.
- Identify all distinct values in the
'year'
column and store the result in theunique_years
column. - Identify all distinct values in the
'fueltype'
column and store the result in theunique_fueltype
variable. - Determine the number of unique fuel types and store the result in the
count_unique_fueltypes
variable.
Solution
Thanks for your feedback!
single
Ask AI
Ask AI
Ask anything or try one of the suggested questions to begin our chat
Awesome!
Completion rate improved to 3.03Awesome!
Completion rate improved to 3.03
Unique Values
Data often gets duplicated in DataFrames. For instance, in the countries
DataFrame, the 'continent'
column has repeated entries. There's a method that retrieves an array of distinct values from a specific DataFrame column.
1234567import pandas as pd country_data = {'country' : ['Thailand', 'Philippines', 'Monaco', 'Malta', 'Sweden', 'Paraguay', 'Latvia'], 'continent' : ['Asia', 'Asia', 'Europe', 'Europe', 'Europe', 'South America', 'Europe'], 'capital':['Bangkok', 'Manila', 'Monaco', 'Valletta', 'Stockholm', 'Asuncion', 'Riga']} countries = pd.DataFrame(country_data) print(countries)
Now, we'll apply the unique()
method to the 'continent'
and 'country'
columns:
12345678910import pandas as pd country_data = {'country' : ['Thailand', 'Philippines', 'Monaco', 'Malta', 'Sweden', 'Paraguay', 'Latvia'], 'continent' : ['Asia', 'Asia', 'Europe', 'Europe', 'Europe', 'South America', 'Europe'], 'capital':['Bangkok', 'Manila', 'Monaco', 'Valletta', 'Stockholm', 'Asuncion', 'Riga']} countries = pd.DataFrame(country_data) unique_countries = countries['country'].unique() unique_continents = countries['continent'].unique() print(unique_countries) print(unique_continents)
To count the number of distinct values in a specific column, you can use the nunique()
method:
1234567import pandas as pd country_data = {'country' : ['Thailand', 'Philippines', 'Monaco', 'Malta', 'Sweden', 'Paraguay', 'Latvia'], 'continent' : ['Asia', 'Asia', 'Europe', 'Europe', 'Europe', 'South America', 'Europe'], 'capital':['Bangkok', 'Manila', 'Monaco', 'Valletta', 'Stockholm', 'Asuncion', 'Riga']} countries = pd.DataFrame(country_data) print(countries['continent'].nunique())
Swipe to start coding
You are given a DataFrame
named audi_cars
.
- Identify all distinct values in the
'year'
column and store the result in theunique_years
column. - Identify all distinct values in the
'fueltype'
column and store the result in theunique_fueltype
variable. - Determine the number of unique fuel types and store the result in the
count_unique_fueltypes
variable.
Solution
Thanks for your feedback!
single
Awesome!
Completion rate improved to 3.03
Unique Values
Swipe to show menu
Data often gets duplicated in DataFrames. For instance, in the countries
DataFrame, the 'continent'
column has repeated entries. There's a method that retrieves an array of distinct values from a specific DataFrame column.
1234567import pandas as pd country_data = {'country' : ['Thailand', 'Philippines', 'Monaco', 'Malta', 'Sweden', 'Paraguay', 'Latvia'], 'continent' : ['Asia', 'Asia', 'Europe', 'Europe', 'Europe', 'South America', 'Europe'], 'capital':['Bangkok', 'Manila', 'Monaco', 'Valletta', 'Stockholm', 'Asuncion', 'Riga']} countries = pd.DataFrame(country_data) print(countries)
Now, we'll apply the unique()
method to the 'continent'
and 'country'
columns:
12345678910import pandas as pd country_data = {'country' : ['Thailand', 'Philippines', 'Monaco', 'Malta', 'Sweden', 'Paraguay', 'Latvia'], 'continent' : ['Asia', 'Asia', 'Europe', 'Europe', 'Europe', 'South America', 'Europe'], 'capital':['Bangkok', 'Manila', 'Monaco', 'Valletta', 'Stockholm', 'Asuncion', 'Riga']} countries = pd.DataFrame(country_data) unique_countries = countries['country'].unique() unique_continents = countries['continent'].unique() print(unique_countries) print(unique_continents)
To count the number of distinct values in a specific column, you can use the nunique()
method:
1234567import pandas as pd country_data = {'country' : ['Thailand', 'Philippines', 'Monaco', 'Malta', 'Sweden', 'Paraguay', 'Latvia'], 'continent' : ['Asia', 'Asia', 'Europe', 'Europe', 'Europe', 'South America', 'Europe'], 'capital':['Bangkok', 'Manila', 'Monaco', 'Valletta', 'Stockholm', 'Asuncion', 'Riga']} countries = pd.DataFrame(country_data) print(countries['continent'].nunique())
Swipe to start coding
You are given a DataFrame
named audi_cars
.
- Identify all distinct values in the
'year'
column and store the result in theunique_years
column. - Identify all distinct values in the
'fueltype'
column and store the result in theunique_fueltype
variable. - Determine the number of unique fuel types and store the result in the
count_unique_fueltypes
variable.
Solution
Thanks for your feedback!