Challenge: Build a Preprocessing Pipeline
Task
You're given a small mixed-type dataset. Build a leakage-safe preprocessing + model pipeline with scikit-learn:
- Split the data into X (features) and y (target), then do a train/test split (test_size=0.3, random_state=42).
- Create a ColumnTransformer named preprocess:
  - numeric columns → StandardScaler()
  - categorical columns → OneHotEncoder(handle_unknown="ignore")
- Build a Pipeline named pipe with steps:
  - ("preprocess", preprocess)
  - ("clf", LogisticRegression(max_iter=1000, random_state=0))
- Fit on train only, then predict on test:
  - compute y_pred and test_accuracy = accuracy_score(y_test, y_pred)
- Add a few prints at the end to show shapes and the accuracy.
Solution
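One way the steps in the task could fit together is sketched below. The challenge's actual dataset is not shown on this page, so the DataFrame here (columns age, income, city and the target bought) is a hypothetical stand-in, as are the numeric_cols and categorical_cols lists; swap in the real column names. Because the scaler and encoder live inside the pipeline, they are fitted on the training rows only, which is what keeps the preprocessing leakage-safe.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: the real challenge dataset is not shown in the task.
df = pd.DataFrame({
    "age": [22, 35, 47, 51, 29, 63, 38, 44, 56, 31, 27, 59],
    "income": [28, 52, 61, 75, 33, 80, 49, 58, 72, 40, 30, 77],
    "city": ["A", "B", "A", "C", "B", "C", "A", "B", "C", "A", "B", "C"],
    "bought": [0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1],
})

# Split into features and target, then into train and test sets.
X = df.drop(columns="bought")
y = df["bought"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Assumed column groups for the toy data above.
numeric_cols = ["age", "income"]
categorical_cols = ["city"]

# Scale numeric columns, one-hot encode categorical ones.
preprocess = ColumnTransformer(
    transformers=[
        ("num", StandardScaler(), numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ]
)

# Preprocessing and classifier chained into a single estimator.
pipe = Pipeline(
    steps=[
        ("preprocess", preprocess),
        ("clf", LogisticRegression(max_iter=1000, random_state=0)),
    ]
)

# Fit on the training split only; the test split is used just for prediction.
pipe.fit(X_train, y_train)
y_pred = pipe.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)

print("X_train shape:", X_train.shape)
print("X_test shape:", X_test.shape)
print("Test accuracy:", test_accuracy)
```

With only a dozen rows the accuracy figure is not meaningful; the point of the sketch is the structure, where pipe.fit sees training data only and pipe.predict reuses the statistics learned there.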