Advanced Techniques in pandas

Is Data in ...?

In this section, we continue extracting data based on specific conditions. Here you will get familiar with a helpful method called .isin(). First, examine the dataset by looking at its first five rows:

import pandas as pd
data = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/4bf24830-59ba-4418-969b-aaf8117d522e/cars.csv', index_col = 0)
print(data.head())

Now, take a look at the example and the explanation below:

import pandas as pd
data = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/4bf24830-59ba-4418-969b-aaf8117d522e/cars.csv', index_col = 0)
models = ['HONDA', 'FORD', 'MERCEDES-BENZ', 'HYUNDAI']
data_extracted = data.loc[data['Manufacturer'].isin(models)]
print(data_extracted.head())

Explanation:

As before, we put the condition inside .loc[]. The .isin(list) method checks, for each value in the column, whether that value appears in the given list. In our case, we check whether the values from the column 'Manufacturer' are in the list models.
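To make the two steps easier to see, here is a minimal sketch on a tiny hand-made DataFrame. The toy data and the names toy, mask, and toy_extracted are assumptions for illustration and are not part of the cars dataset.

```python
import pandas as pd

# Hypothetical miniature dataset; the values are made up for illustration
toy = pd.DataFrame({
    'Manufacturer': ['HONDA', 'BMW', 'FORD', 'TOYOTA'],
    'Color': ['Grey', 'Black', 'White', 'Red'],
})
models = ['HONDA', 'FORD']

# .isin() returns a boolean Series: True where the value is in the list
mask = toy['Manufacturer'].isin(models)
print(mask.tolist())  # [True, False, True, False]

# .loc[] keeps only the rows where the mask is True
toy_extracted = toy.loc[mask]
print(toy_extracted)
```

Storing the mask in a separate variable is optional; it is done here only to show what .isin() produces before .loc[] filters the rows.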

Task


Your task is to extract data about cars whose value in the column 'Color' is 'Grey', 'White', or 'Black'. Follow these steps:

  1. Create the list colors with the elements 'Grey', 'White', 'Black' (in this order).
  2. Extract the rows whose value in the column 'Color' is in the list colors. Use .loc[] and store the result in data_extracted.
  3. Output the last five rows of data_extracted.

Solution

import pandas as pd

data = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/4bf24830-59ba-4418-969b-aaf8117d522e/cars.csv', index_col = 0)

# Create a list
colors = ['Grey', 'White', 'Black']
# Extract needed values
data_extracted = data.loc[data['Color'].isin(colors)]

# Output data
print(data_extracted.tail())
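
For comparison, the same selection can be written as equality checks joined with |. The sketch below uses a small made-up DataFrame (toy is an assumption, not the cars dataset) to show that both forms select the same rows.

```python
import pandas as pd

# Hypothetical toy data; not taken from the cars dataset
toy = pd.DataFrame({'Color': ['Grey', 'Red', 'Black', 'White', 'Blue']})
colors = ['Grey', 'White', 'Black']

# Selection with .isin()
with_isin = toy.loc[toy['Color'].isin(colors)]

# Equivalent selection with explicit comparisons joined by | (logical OR)
with_or = toy.loc[
    (toy['Color'] == 'Grey') | (toy['Color'] == 'White') | (toy['Color'] == 'Black')
]

# Both approaches return the same rows
print(with_isin.equals(with_or))  # True
```

With more than a couple of values, .isin() is usually the shorter and more readable option.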


Section 3. Chapter 1

import pandas as pd

data = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/4bf24830-59ba-4418-969b-aaf8117d522e/cars.csv', index_col = 0)

# Create a list
colors = ___
# Extract needed values
data_extracted = data.loc[data[___].___(___)]

# Output data
print(data_extracted.___)
