Challenge: Preprocessing Pipeline | Data Preprocessing and Feature Engineering


Task


You are given the Titanic dataset from the seaborn library. Your task is to build a complete preprocessing pipeline that applies the transformations commonly needed before training a machine learning model: missing-value imputation, categorical encoding, feature scaling, and feature creation.

Follow these steps:

  1. Load the dataset using sns.load_dataset("titanic").
  2. Handle missing values:
    • Numeric columns → fill with the column mean.
    • Categorical columns → fill with the column mode (most frequent value).
  3. Encode the categorical features sex and embarked using pd.get_dummies().
  4. Scale numeric columns age and fare using StandardScaler.
  5. Create a new feature family_size = sibsp + parch + 1.
  6. Combine all transformations into a function called preprocess_titanic(data) that returns the final processed DataFrame.
  7. Assign the processed dataset to a variable called processed_data.

Print the first 5 rows of the final DataFrame.
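
The steps above can be sketched as follows. This is one possible solution, not the course's reference answer. It assumes that every non-numeric column (object or category dtype) counts as "categorical" for imputation, that pd.get_dummies() is used with its default settings (no dropped level), and that the scaler is fitted on the whole DataFrame because the task does not mention a train/test split. The column names come from the task; copying the input so the caller's DataFrame is left unchanged is a design choice of this sketch.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler


def preprocess_titanic(data: pd.DataFrame) -> pd.DataFrame:
    """Return a preprocessed copy of a Titanic-style DataFrame."""
    df = data.copy()  # leave the caller's DataFrame untouched

    # Step 2: impute missing values column by column.
    for col in df.columns:
        if not df[col].isna().any():
            continue
        if pd.api.types.is_numeric_dtype(df[col]):
            # Numeric column: fill with the mean of the observed values.
            df[col] = df[col].fillna(df[col].mean())
        else:
            # Assumption: anything non-numeric is treated as categorical
            # and filled with its most frequent value.
            df[col] = df[col].fillna(df[col].mode().iloc[0])

    # Step 5: family size counts siblings/spouses, parents/children,
    # and the passenger themselves.
    df["family_size"] = df["sibsp"] + df["parch"] + 1

    # Step 3: one-hot encode sex and embarked; the original columns are
    # replaced by indicator columns such as sex_male or embarked_S.
    df = pd.get_dummies(df, columns=["sex", "embarked"])

    # Step 4: standardize age and fare to zero mean and unit variance.
    # Assumption: the scaler is fitted on all rows, since no split is given.
    scaler = StandardScaler()
    df[["age", "fare"]] = scaler.fit_transform(df[["age", "fare"]])

    return df
```

To finish steps 1 and 7, import seaborn as sns, load the data with titanic = sns.load_dataset("titanic") (the first call downloads the file, so it needs network access), assign processed_data = preprocess_titanic(titanic), and print the first five rows with print(processed_data.head()). If the data were later split into training and test sets, fitting the scaler and computing the fill values on the training portion only would avoid leaking test information.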


