Unique Values
Data in DataFrames often contains duplicates. For example, the 'continent' column of the countries DataFrame has repeated entries. The unique() method retrieves an array of the unique values in a specific DataFrame column.
import pandas as pd
country_data = {'country' : ['Thailand', 'Philippines', 'Monaco', 'Malta', 'Sweden', 'Paraguay', 'Latvia'],
'continent' : ['Asia', 'Asia', 'Europe', 'Europe', 'Europe', 'South America', 'Europe'],
'capital':['Bangkok', 'Manila', 'Monaco', 'Valletta', 'Stockholm', 'Asuncion', 'Riga']}
countries = pd.DataFrame(country_data)
print(countries)
Now we will apply the unique() method to the 'continent' and 'country' columns:
import pandas as pd
country_data = {'country' : ['Thailand', 'Philippines', 'Monaco', 'Malta', 'Sweden', 'Paraguay', 'Latvia'],
'continent' : ['Asia', 'Asia', 'Europe', 'Europe', 'Europe', 'South America', 'Europe'],
'capital':['Bangkok', 'Manila', 'Monaco', 'Valletta', 'Stockholm', 'Asuncion', 'Riga']}
countries = pd.DataFrame(country_data)
unique_countries = countries['country'].unique()
unique_continents = countries['continent'].unique()
print(unique_countries)
print(unique_continents)
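As a small follow-up sketch, the snippet below illustrates that unique() returns values in the order they first appear rather than in sorted order. It reuses the tutorial's data; the names continents_in_order and countries_sorted are introduced here purely for illustration, and wrapping the result in list() or sorted() is one possible way to work with it, not the only one.

```python
import pandas as pd

country_data = {'country': ['Thailand', 'Philippines', 'Monaco', 'Malta', 'Sweden', 'Paraguay', 'Latvia'],
                'continent': ['Asia', 'Asia', 'Europe', 'Europe', 'Europe', 'South America', 'Europe']}
countries = pd.DataFrame(country_data)

# unique() keeps the order of first appearance; it does not sort the values
continents_in_order = list(countries['continent'].unique())
print(continents_in_order)

# Every country appears only once, so unique() returns them in their original order
countries_in_order = list(countries['country'].unique())
print(countries_in_order)

# If a sorted result is needed, sort it explicitly (illustrative choice)
countries_sorted = sorted(countries['country'].unique())
print(countries_sorted)
```

Converting to a list is convenient for printing or comparing; the object returned by unique() itself is an array-like and can also be used directly.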
To count the number of unique values in a specific column, you can use the nunique() method:
import pandas as pd
country_data = {'country' : ['Thailand', 'Philippines', 'Monaco', 'Malta', 'Sweden', 'Paraguay', 'Latvia'],
'continent' : ['Asia', 'Asia', 'Europe', 'Europe', 'Europe', 'South America', 'Europe'],
'capital':['Bangkok', 'Manila', 'Monaco', 'Valletta', 'Stockholm', 'Asuncion', 'Riga']}
countries = pd.DataFrame(country_data)
print(countries['continent'].nunique())
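The sketch below relates nunique() to unique() and shows two variations that may be useful: calling nunique() on the whole DataFrame to get a count per column, and the dropna parameter, which controls whether missing values are counted. The with_missing Series is a hypothetical example that is not part of the tutorial's data.

```python
import pandas as pd

country_data = {'country': ['Thailand', 'Philippines', 'Monaco', 'Malta', 'Sweden', 'Paraguay', 'Latvia'],
                'continent': ['Asia', 'Asia', 'Europe', 'Europe', 'Europe', 'South America', 'Europe'],
                'capital': ['Bangkok', 'Manila', 'Monaco', 'Valletta', 'Stockholm', 'Asuncion', 'Riga']}
countries = pd.DataFrame(country_data)

# With no missing values, nunique() matches the length of unique()
n_continents = countries['continent'].nunique()
n_from_unique = len(countries['continent'].unique())
print(n_continents, n_from_unique)

# Called on the DataFrame, nunique() returns one count per column
per_column = countries.nunique()
print(per_column)

# Hypothetical Series with a missing value (assumption, not from the tutorial)
with_missing = pd.Series(['Asia', None, 'Asia'])
count_default = with_missing.nunique()                  # missing values are ignored by default
count_with_missing = with_missing.nunique(dropna=False)  # missing value counted as its own entry
print(count_default, count_with_missing)
```

The two counts differ only when the column contains missing values, which is worth keeping in mind when comparing nunique() against len(unique()).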
Task
You are given a DataFrame named audi_cars.
- Identify all unique values in the 'year' column and store the result in the unique_years variable.
- Identify all unique values in the 'fueltype' column and store the result in the unique_fueltype variable.
- Determine the number of unique fuel types and store the result in the count_unique_fueltypes variable.
Solution
import pandas as pd
cars_data = {'model': ['audi A1', 'audi A6', 'audi A4', 'audi A3','audi A1'],
'year': [2017, 2016, 2017, 2019, 2016],
'fueltype': ['petrol', 'diesel', 'diesel', 'petrol', 'petrol'],
'capital': ['Manila', 'Monaco', 'Bangkok', 'Stockhol', 'Valletta']}
audi_cars = pd.DataFrame(cars_data)
# Write your code below
unique_years = audi_cars['year'].unique()
unique_fueltype = audi_cars['fueltype'].unique()
count_unique_fueltypes = audi_cars['fueltype'].nunique()
# Testing the result
print(unique_years)
print(unique_fueltype)
print(count_unique_fueltypes)
Starter code (replace each ___ with your own expression):
import pandas as pd
cars_data = {'model': ['audi A1', 'audi A6', 'audi A4', 'audi A3','audi A1'],
'year': [2017, 2016, 2017, 2019, 2016],
'fueltype': ['petrol', 'diesel', 'diesel', 'petrol', 'petrol'],
'capital': ['Manila', 'Monaco', 'Bangkok', 'Stockhol', 'Valletta']}
audi_cars = pd.DataFrame(cars_data)
# Write your code below
unique_years = ___
unique_fueltype = ___
count_unique_fueltypes = ___
# Testing the result
print(unique_years)
print(unique_fueltype)
print(count_unique_fueltypes)