Learn Challenge | Preprocessing Data: Part I
Data Manipulation using pandas

Challenge

You've solved the first problem with a wrong column type. Now let's solve the remaining one: the dots. Recall that four columns still have the wrong type: 'morgh', 'valueh', 'grosrth', and 'omphtotinch'. In these columns, a dot indicates 'Not applicable'. For instance, the columns valueh and grosrth are mutually exclusive: the first holds the price of the dwelling (i.e., the house is owned), while the second holds the monthly rent.

The most appropriate way to solve this problem is to replace the dots with NA values. The columns can then be treated as numerical ones.
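To illustrate why the dots get in the way, here is a minimal sketch. The values are hypothetical and only stand in for a column such as grosrth; they are not taken from the course dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical rent values; the dot marks 'Not applicable'
raw = pd.Series(["800", ".", "950"], name="grosrth")
# The dtype is not numeric (object or a string dtype, depending on the pandas version)
print(raw.dtype)

# The same values with the dot expressed as NaN
cleaned = pd.Series([800, np.nan, 950], name="grosrth")
print(cleaned.dtype)   # float64
print(cleaned.mean())  # 875.0, since NaN is skipped by default
```

Because NaN is itself a floating-point value, the column can hold it alongside numbers, and aggregations such as mean() skip it by default.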

Task


Replace the dot symbols (.) with NA values in the 'morgh', 'valueh', 'grosrth', and 'omphtotinch' columns. Follow these steps:

  1. Import the NumPy library with the alias np.
  2. Apply the .where() method to the df DataFrame.
  3. Set the condition that specifies which values must remain unchanged: those that are not dots.
  4. Set the other parameter to NumPy's nan value.
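The steps above can be sketched as follows. This is one possible approach, not necessarily the course's reference solution. The small DataFrame is a hypothetical stand-in for the real df, and the final conversion with pd.to_numeric is an optional extra beyond the four steps.

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for the course dataset
df = pd.DataFrame({
    "morgh": ["1", ".", "2"],
    "valueh": ["150000", ".", "."],
    "grosrth": [".", "800", "950"],
    "omphtotinch": ["20", ".", "35"],
})
cols = ["morgh", "valueh", "grosrth", "omphtotinch"]

# where() keeps values for which the condition is True (non-dots)
# and replaces all other values with `other`
df = df.where(df != ".", other=np.nan)

# Optional extra step: convert the cleaned columns to a numeric dtype
df[cols] = df[cols].apply(pd.to_numeric)
print(df.dtypes)
```

Note that .where() keeps the values that satisfy the condition, so the condition describes what stays rather than what is replaced.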

Section 1. Chapter 8
