Challenge: Build a Preprocessing Pipeline
Tâche
Swipe to start coding
You're given a small mixed-type dataset. Build a leakage-safe preprocessing + model pipeline with scikit-learn:
- Split data into X (features) and y (target), then do a train/test split (
test_size=0.3,random_state=42). - Create a ColumnTransformer named
preprocess:- numeric columns →
StandardScaler() - categorical columns →
OneHotEncoder(handle_unknown="ignore")
- numeric columns →
- Build a Pipeline named
pipewith steps:("preprocess", preprocess)("clf", LogisticRegression(max_iter=1000, random_state=0))
- Fit on train only, then predict on test:
- compute
y_predandtest_accuracy = accuracy_score(y_test, y_pred)
- compute
- Add a few prints at the end to show shapes and the accuracy.
Solution
Tout était clair ?
Merci pour vos commentaires !
Section 5. Chapitre 3
single
Demandez à l'IA
Demandez à l'IA
Posez n'importe quelle question ou essayez l'une des questions suggérées pour commencer notre discussion
Awesome!
Completion rate improved to 5.26
Challenge: Build a Preprocessing Pipeline
Glissez pour afficher le menu
Tâche
Swipe to start coding
You're given a small mixed-type dataset. Build a leakage-safe preprocessing + model pipeline with scikit-learn:
- Split data into X (features) and y (target), then do a train/test split (
test_size=0.3,random_state=42). - Create a ColumnTransformer named
preprocess:- numeric columns →
StandardScaler() - categorical columns →
OneHotEncoder(handle_unknown="ignore")
- numeric columns →
- Build a Pipeline named
pipewith steps:("preprocess", preprocess)("clf", LogisticRegression(max_iter=1000, random_state=0))
- Fit on train only, then predict on test:
- compute
y_predandtest_accuracy = accuracy_score(y_test, y_pred)
- compute
- Add a few prints at the end to show shapes and the accuracy.
Solution
Tout était clair ?
Merci pour vos commentaires !
Section 5. Chapitre 3
single