Learn Challenge: Putting It All Together | Modeling
Introduction to Machine Learning with Python

Challenge: Putting It All Together

In this challenge, apply the full workflow learned in the course — from data preprocessing through training to model evaluation.

Task

You are working with a penguin dataset. Build an ML pipeline to classify species with KNN, handling encoding, missing values, scaling, and tuning.

  1. Encode y with LabelEncoder.
  2. Split with train_test_split(test_size=0.33).
  3. Make ct: OneHotEncoder on 'island', 'sex', remainder='passthrough'.
  4. Set param_grid for n_neighbors, weights, p.
  5. Create GridSearchCV(KNeighborsClassifier(), param_grid).
  6. Pipeline: ct -> SimpleImputer(strategy='most_frequent') -> StandardScaler -> GridSearchCV.
  7. Fit on train.
  8. Print test .score.
  9. Predict, print first 5 decoded labels.
  10. Print .best_estimator_.

Solution
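
One possible way to work through the ten steps is sketched below. It is not the course's reference solution. Because the page does not say how the penguin data is loaded, the sketch builds a small synthetic stand-in DataFrame; the column names, the generated values, the param_grid values, and random_state=0 are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler

# Hypothetical stand-in for the penguin dataset: synthetic values and
# assumed column names, NOT the real course data.
rng = np.random.default_rng(0)
n = 150
species = rng.choice(["Adelie", "Chinstrap", "Gentoo"], size=n)
shift = pd.Series(species).map(
    {"Adelie": 0.0, "Chinstrap": 5.0, "Gentoo": 10.0}
).to_numpy()
df = pd.DataFrame({
    "species": species,
    "island": rng.choice(["Biscoe", "Dream", "Torgersen"], size=n),
    "sex": rng.choice(["MALE", "FEMALE"], size=n),
    "bill_length_mm": 38 + shift + rng.normal(0, 1.5, n),
    "flipper_length_mm": 190 + 2 * shift + rng.normal(0, 4, n),
})
df.loc[::17, "bill_length_mm"] = np.nan  # inject a few missing values

X = df.drop(columns="species")

# Step 1: encode the target labels as integers.
label_enc = LabelEncoder()
y = label_enc.fit_transform(df["species"])

# Step 2: hold out a third of the rows (random_state is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0
)

# Step 3: one-hot encode the categorical columns, pass the rest through.
ct = make_column_transformer(
    (OneHotEncoder(), ["island", "sex"]), remainder="passthrough"
)

# Steps 4-5: grid of KNN hyperparameters (example values) and the search.
param_grid = {
    "n_neighbors": [1, 3, 5, 7, 9],
    "weights": ["uniform", "distance"],
    "p": [1, 2],
}
grid_search = GridSearchCV(KNeighborsClassifier(), param_grid)

# Step 6: encode -> impute -> scale -> tuned KNN.
pipe = make_pipeline(
    ct, SimpleImputer(strategy="most_frequent"), StandardScaler(), grid_search
)

# Step 7: fit on the training split only.
pipe.fit(X_train, y_train)

# Step 8: accuracy on the held-out split.
test_score = pipe.score(X_test, y_test)
print(test_score)

# Step 9: predict and map the first five predictions back to species names.
y_pred = pipe.predict(X_test)
decoded = label_enc.inverse_transform(y_pred[:5])
print(decoded)

# Step 10: the best KNN found by the grid search (last pipeline step).
print(pipe[-1].best_estimator_)
```

Placing GridSearchCV as the final pipeline step means the encoder, imputer, and scaler are fitted once on the full training split, and only the KNN hyperparameters are cross-validated on the already-transformed data. To run this against the real dataset, replace the synthetic df with the course's DataFrame and adjust the column names accordingly.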

Section 4. Chapter 10