Data Preprocessing
Challenge 1
In this challenge, you will work with the 'adult-census.csv' dataset. It contains both categorical and numerical data. Your task is to prepare the data for processing.
- Read the dataset 'adult-census.csv'.
- Explore the dataset. Carefully check which character marks missing data in the dataset and replace it with np.nan.
- Remove the rows with missing values.
- Start with the categorical data: encode the columns 'workclass' and 'sex' using one-hot encoding.
- For the numeric data ('age', 'hours-per-week'), scale the values.
- Print the processed data.
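The steps above can be sketched as follows. This is one possible approach, not the reference solution. It makes several assumptions: the missing-value marker is taken to be '?' (check this in the real file), the small inline sample is a hypothetical stand-in for 'adult-census.csv', and the helper name `preprocess` is made up for illustration. One-hot encoding uses `pd.get_dummies` and scaling is done as manual standardization (zero mean, unit variance); the challenge may instead expect scikit-learn's `OneHotEncoder` and `StandardScaler`.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for 'adult-census.csv'. With the real file you would
# call pd.read_csv('adult-census.csv') instead.
SAMPLE_CSV = """age,workclass,sex,hours-per-week
39,State-gov,Male,40
50,?,Male,13
38,Private,Female,40
53,Private,Male,45
28,?,Female,40
37,State-gov,Female,50
"""


def preprocess(df, missing_marker="?"):
    """Clean, encode, and scale the columns named in the challenge."""
    # Replace the missing-value marker (assumed to be '?') with np.nan.
    df = df.replace(missing_marker, np.nan)
    # Remove rows that contain missing values.
    df = df.dropna().reset_index(drop=True)
    # One-hot encode the categorical columns.
    df = pd.get_dummies(df, columns=["workclass", "sex"], dtype=int)
    # Standardize each numeric column to zero mean and unit variance
    # (population standard deviation, ddof=0).
    for col in ["age", "hours-per-week"]:
        df[col] = (df[col] - df[col].mean()) / df[col].std(ddof=0)
    return df


# skipinitialspace strips blanks after commas, so a marker written as ' ?'
# is read as '?'.
raw = pd.read_csv(io.StringIO(SAMPLE_CSV), skipinitialspace=True)
processed = preprocess(raw)
print(processed)
```

Before choosing the marker, `raw['workclass'].unique()` is a quick way to see which value stands in for missing data in the real file.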