Challenge: Encoding Categorical Variables
To summarize the previous three chapters, here is a table showing what encoder you should use:
In this challenge, the penguins dataset (without missing values) is provided. All categorical features, including the target ('species' column), must be encoded.
Here is a reminder of the dataset structure:
import pandas as pd
df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/a65bbc96-309e-4df9-a790-a1eb8c815a1c/penguins_imputed.csv')
print(df.head())
Keep in mind that 'island' and 'sex' are categorical features and 'species' is a categorical target.
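If you want to confirm which columns are categorical before encoding, a quick check along these lines may help. This is only a sketch: the small DataFrame below is a hypothetical stand-in with invented values, and the 'body_mass_g' column name is an assumption, so the real dataset may look different.

```python
import pandas as pd

# Hypothetical stand-in for the penguins data (values invented for illustration).
# In the challenge, df comes from the CSV loaded above.
df = pd.DataFrame({
    'species': ['Adelie', 'Gentoo', 'Chinstrap', 'Adelie'],
    'island': ['Torgersen', 'Biscoe', 'Dream', 'Dream'],
    'body_mass_g': [3750.0, 5000.0, 3500.0, 3800.0],  # assumed numeric column
    'sex': ['MALE', 'FEMALE', 'FEMALE', 'MALE'],
})

# Every non-numeric column is a candidate for encoding
cat_cols = df.select_dtypes(exclude='number').columns.tolist()
print(cat_cols)

# Listing the distinct values shows how many categories each column has
for col in cat_cols:
    print(col, sorted(df[col].unique()))
```

On this stand-in frame the non-numeric columns are 'species', 'island', and 'sex', matching the note above.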
You are given a DataFrame named df that contains penguin data.
Your task is to encode all categorical features so that the data can be used in a machine learning model.
- Import the OneHotEncoder and LabelEncoder classes from sklearn.preprocessing.
- Separate the feature matrix X and the target variable y from the DataFrame.
- Create a OneHotEncoder object and apply it to the 'island' and 'sex' columns in X.
- Replace the original categorical columns with the encoded ones.
- Create a LabelEncoder object and apply it to the 'species' column to encode the target variable y.
Solution