Data Preprocessing and Feature Engineering

Challenge: Preprocessing Pipeline

Task

You are given the Titanic dataset from the seaborn library. Your task is to build a preprocessing pipeline that applies the transformations commonly needed before machine learning: missing-value imputation, categorical encoding, feature scaling, and feature creation.

Follow these steps:

  1. Load the dataset using sns.load_dataset("titanic").
  2. Handle missing values:
    • Numeric columns → fill with mean.
    • Categorical columns → fill with mode.
  3. Encode the categorical features sex and embarked using pd.get_dummies().
  4. Scale numeric columns age and fare using StandardScaler.
  5. Create a new feature family_size = sibsp + parch + 1.
  6. Combine all transformations into a function called preprocess_titanic(data) that returns the final processed DataFrame.
  7. Assign the processed dataset to a variable called processed_data.

Print the first 5 rows of the final DataFrame.
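
The steps above can be sketched as follows. This is one possible approach, not the official solution. To keep the example runnable offline, it is demonstrated on a few made-up rows that reuse the Titanic column names; the sample values and the helper load_and_preprocess are assumptions introduced for illustration only.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler


def preprocess_titanic(data: pd.DataFrame) -> pd.DataFrame:
    """Apply steps 2-5 to a Titanic-style DataFrame and return a new frame."""
    df = data.copy()  # leave the caller's DataFrame untouched

    # Step 2: numeric columns get the column mean, all other columns the mode.
    numeric_cols = df.select_dtypes(include="number").columns
    other_cols = df.columns.difference(numeric_cols)
    df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].mean())
    for col in other_cols:
        mode = df[col].mode(dropna=True)
        if not mode.empty:  # skip columns that contain no values at all
            df[col] = df[col].fillna(mode.iloc[0])

    # Step 3: one-hot encode sex and embarked (all levels are kept).
    df = pd.get_dummies(df, columns=["sex", "embarked"])

    # Step 4: standardize age and fare to zero mean and unit variance.
    df[["age", "fare"]] = StandardScaler().fit_transform(df[["age", "fare"]])

    # Step 5: family size counts siblings/spouses, parents/children, and the passenger.
    df["family_size"] = df["sibsp"] + df["parch"] + 1
    return df


def load_and_preprocess() -> pd.DataFrame:
    """Hypothetical helper for step 1; needs seaborn and access to its dataset files."""
    import seaborn as sns  # imported lazily so the demo below runs without it

    return preprocess_titanic(sns.load_dataset("titanic"))


# Made-up sample rows (not real Titanic records) with one missing age and port.
sample = pd.DataFrame(
    {
        "age": [22.0, None, 26.0, 35.0],
        "fare": [7.25, 71.28, 7.92, 53.10],
        "sex": ["male", "female", "female", "male"],
        "embarked": ["S", "C", None, "S"],
        "sibsp": [1, 1, 0, 1],
        "parch": [0, 0, 0, 2],
    }
)

# Steps 6-7: run the function and keep the result in processed_data.
processed_data = preprocess_titanic(sample)
print(processed_data.head())
```

On the real dataset, processed_data = load_and_preprocess() would replace the sample call. Note that this sketch fits the scaler and computes the fill values on the whole DataFrame, which matches the task as stated; in a modeling workflow those statistics would normally be learned from the training split only.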

Section 1. Chapter 12