Challenge: Creating a Complete ML Pipeline
Now create a pipeline that includes a final estimator. Once trained, this pipeline can generate predictions for new instances using the .predict() method.
The transformers in a pipeline only process the features X, but a predictor also requires the target variable y, so encode y separately from the pipeline built for X. Use LabelEncoder to encode the target.
Since the predictions are encoded as 0, 1, or 2, the .inverse_transform() method of LabelEncoder can be used to convert them back to the original labels: 'Adelie', 'Chinstrap', or 'Gentoo'.
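As a minimal sketch of this round trip, the snippet below encodes a small hand-written list of species names and then decodes the integers back. The list `species` and the variable name `label_enc` are illustrative assumptions, not part of the exercise data.

```python
from sklearn.preprocessing import LabelEncoder

# Toy target values standing in for the species column (illustrative only)
species = ['Adelie', 'Gentoo', 'Chinstrap', 'Adelie']

label_enc = LabelEncoder()
# Classes are sorted alphabetically, so Adelie=0, Chinstrap=1, Gentoo=2
y = label_enc.fit_transform(species)

# Map the integer codes back to the original string labels
decoded = label_enc.inverse_transform(y)

print(y)              # [0 2 1 0]
print(list(decoded))  # the original species names
```

The same fitted encoder must be kept around after training, since it holds the mapping needed to decode the pipeline's predictions.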
You are given a DataFrame named df that contains penguin data.
Your task is to build and train a complete machine learning pipeline that preprocesses the data and applies a KNeighborsClassifier model.
- Encode the target variable y using the LabelEncoder class.
- Create a ColumnTransformer named ct that applies a OneHotEncoder to the 'island' and 'sex' columns, while leaving the other columns unchanged (remainder='passthrough').
- Create a pipeline that includes the following steps in order:
  - The ColumnTransformer you defined (ct);
  - A SimpleImputer with the strategy parameter set to 'most_frequent';
  - A StandardScaler for feature scaling;
  - A KNeighborsClassifier as the final model.
- Train the pipeline on the features X and target y.
- Generate predictions for X using the trained pipeline and print the decoded class names.
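The steps above can be sketched as follows. This is one possible arrangement, not the official solution. Because the real df is not shown here, the example builds a small synthetic DataFrame. The target column name 'species', the numeric column names, and all values are assumptions made for illustration; only 'island' and 'sex' come from the task description. Missing values are placed in numeric columns only.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler

# Synthetic stand-in for df; 'species' and the numeric column names are assumed
df = pd.DataFrame({
    'species': ['Adelie'] * 4 + ['Chinstrap'] * 4 + ['Gentoo'] * 4,
    'island': ['Torgersen', 'Biscoe', 'Dream', 'Torgersen',
               'Dream', 'Dream', 'Dream', 'Dream',
               'Biscoe', 'Biscoe', 'Biscoe', 'Biscoe'],
    'sex': ['MALE', 'FEMALE'] * 6,
    'body_mass_g': [3750, 3800, np.nan, 3450, 3500, 3900,
                    3650, 3525, 4500, 5700, 4450, 5400],
    'flipper_length_mm': [181, 186, 195, 193, 192, 196,
                          193, np.nan, 211, 230, 210, 218],
})

# Separate features and target, then encode the target outside the pipeline
X = df.drop(columns='species')
label_enc = LabelEncoder()
y = label_enc.fit_transform(df['species'])

# One-hot encode the categorical columns, pass the rest through unchanged
ct = ColumnTransformer(
    [('ohe', OneHotEncoder(), ['island', 'sex'])],
    remainder='passthrough',
)

# Preprocessing steps followed by the final estimator
pipe = Pipeline([
    ('ct', ct),
    ('imputer', SimpleImputer(strategy='most_frequent')),
    ('scaler', StandardScaler()),
    ('knn', KNeighborsClassifier()),
])

# Train on X and y, predict on X, and decode the integer predictions
pipe.fit(X, y)
y_pred = pipe.predict(X)
decoded = label_enc.inverse_transform(y_pred)
print(decoded)
```

Predicting on the same X used for training only demonstrates the mechanics. It says little about how the model would perform on unseen penguins.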