Challenge: Preprocessing Pipeline
You are given the Titanic dataset from the seaborn library.
Your task is to build a complete preprocessing pipeline that performs all essential data transformations used before machine learning.
Follow these steps:
- Load the dataset using sns.load_dataset("titanic").
- Handle missing values:
  - Numeric columns → fill with mean.
  - Categorical columns → fill with mode.
- Encode the categorical features sex and embarked using pd.get_dummies().
- Scale numeric columns age and fare using StandardScaler.
- Create a new feature family_size = sibsp + parch + 1.
- Combine all transformations into a function called preprocess_titanic(data) that returns the final processed DataFrame.
- Assign the processed dataset to a variable called processed_data.
Print the first 5 rows of the final DataFrame.
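
One possible way to put these steps together is sketched below. It is not the official solution, only a minimal version under a few assumptions: the small `sample` DataFrame is a hypothetical stand-in that borrows the Titanic column names so the example runs without downloading anything, and any non-numeric column is treated as categorical. With the real data, `sample` would be replaced by the result of `sns.load_dataset("titanic")`.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler


def preprocess_titanic(data):
    """Return a processed copy of a Titanic-style DataFrame."""
    df = data.copy()  # leave the caller's DataFrame untouched

    # 1. Handle missing values: mean for numeric columns, mode for the rest.
    for col in df.columns:
        if not df[col].isna().any():
            continue
        if pd.api.types.is_numeric_dtype(df[col]):
            df[col] = df[col].fillna(df[col].mean())
        else:
            # mode() ignores NaN; take the first value if there are ties
            df[col] = df[col].fillna(df[col].mode().iloc[0])

    # 2. One-hot encode the categorical features.
    df = pd.get_dummies(df, columns=["sex", "embarked"])

    # 3. Scale age and fare to zero mean and unit variance.
    scaler = StandardScaler()
    df[["age", "fare"]] = scaler.fit_transform(df[["age", "fare"]])

    # 4. Add the engineered feature.
    df["family_size"] = df["sibsp"] + df["parch"] + 1

    return df


# Hypothetical stand-in data (assumption, not the real dataset).
# With seaborn installed and network access, use instead:
#   import seaborn as sns
#   sample = sns.load_dataset("titanic")
sample = pd.DataFrame({
    "survived": [0, 1, 1, 0, 1],
    "sex": ["male", "female", "female", "male", "female"],
    "age": [22.0, 38.0, None, 35.0, 4.0],
    "sibsp": [1, 1, 0, 0, 1],
    "parch": [0, 0, 0, 0, 2],
    "fare": [7.25, 71.28, 7.92, 8.05, 16.70],
    "embarked": ["S", "C", "S", None, "S"],
})

processed_data = preprocess_titanic(sample)
print(processed_data.head())
```

Imputation runs first so that the scaler and the encoder never see missing values. Note that this sketch fits the scaler on the whole dataset, as the task asks; in a real modeling workflow the scaler would normally be fitted on the training split only.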