Challenge | Preprocessing Data: Part I
Data Manipulation using pandas

Challenge

You've solved the first problem with the wrong column types. Let's solve the remaining one (with dots). Recall that 4 columns with wrong types are left ('morgh', 'valueh', 'grosrth', 'omphtotinch'). These columns use dots as indicators for 'Not applicable'. For instance, the columns valueh and grosrth are mutually exclusive, since the first one indicates the price of the dwelling (i.e., the house is owned) and the second one indicates the monthly rent.

The most appropriate way to solve this problem is to replace the dots with NA values. That makes it possible to then treat each of these columns as a numerical one.
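To see why the dots get in the way, here is a minimal sketch with made-up values (the numbers are illustrative assumptions, not taken from the course dataset). A column that mixes number-like strings with '.' is not numeric, while a column that uses NaN for the missing entries is, and its aggregations skip the missing value by default.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values, for illustration only
raw = pd.Series(['250000', '.', '300000'])      # dots force a non-numeric dtype
clean = pd.Series([250000, np.nan, 300000])     # NaN keeps the column numeric (float64)

raw_is_numeric = pd.api.types.is_numeric_dtype(raw)      # False
clean_is_numeric = pd.api.types.is_numeric_dtype(clean)  # True

# mean() ignores NaN by default, so only the two real values are averaged
clean_mean = clean.mean()
```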

Task

Replace the dot symbols (.) with NAs in the 'morgh', 'valueh', 'grosrth', 'omphtotinch' columns. Follow these steps:

  1. Import the NumPy library with the np alias.
  2. Apply the .where() method to the df dataframe.
  3. Set the condition that determines which values must remain unchanged. These must be the non-dot values.
  4. Set the other parameter to the nan value from NumPy.
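The steps above can be sketched roughly as follows. This is only one possible approach, shown on a small hypothetical dataframe: the sample values and the helper name `cols` are assumptions for illustration, while the column names come from the task. `.where()` keeps a value wherever the condition is True and substitutes `other` wherever it is False.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the course dataframe (values are made up)
df = pd.DataFrame({
    'morgh': ['1200', '.', '850'],
    'valueh': ['250000', '.', '.'],
    'grosrth': ['.', '900', '1100'],
    'omphtotinch': ['30', '.', '25'],
})

# Columns named in the task
cols = ['morgh', 'valueh', 'grosrth', 'omphtotinch']

# Keep every value that is not a dot; replace the rest with NaN
df[cols] = df[cols].where(df[cols] != '.', other=np.nan)

# Count the missing values per column after the replacement
na_counts = df[cols].isna().sum()
```

Note that in this sketch the remaining values are still strings; converting the columns to a numeric type would be a separate step.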

Section 1. Chapter 8