In this Challenge, you are given the following dataset:

Here is its plot.

The dataset is clearly not linearly separable. Let's look at the Logistic Regression performance:

The result is poor: plain Logistic Regression fits a linear decision boundary, so it is not suited for this task. Your task is to check whether PolynomialFeatures will help. To find the best C parameter, you will use the GridSearchCV class.
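The course's dataset is not reproduced here, so the sketch below uses a stand-in: it assumes a two-class dataset of concentric circles generated with `make_circles` (an assumption, not the challenge's data) and scores a plain `LogisticRegression` with 5-fold cross-validation. The variable names `baseline_scores` and `baseline_acc` are illustrative.

```python
from sklearn.datasets import make_circles
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Stand-in data (assumption): two concentric circles, not linearly separable.
X, y = make_circles(n_samples=200, noise=0.1, factor=0.3, random_state=0)

# A linear decision boundary cannot separate an inner circle from an outer one,
# so the cross-validated accuracy is expected to stay far from 1.0.
baseline_scores = cross_val_score(LogisticRegression(), X, y, cv=5)
baseline_acc = baseline_scores.mean()
print(f"Baseline cross-validated accuracy: {baseline_acc:.2f}")
```

On data like this, the accuracy typically hovers near chance level, which is the kind of result the challenge sets out to improve.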

In this challenge, a Pipeline is used. You can think of it as a list of preprocessing steps. Its .fit_transform() method applies .fit_transform() to each step in order, passing the output of one step as the input to the next.
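As a minimal sketch of how such a pipeline chains its steps, the example below runs a tiny made-up array through `PolynomialFeatures` and then `StandardScaler`. The array `X_demo`, the step names `"poly"` and `"scale"`, and the choice of `include_bias=False` are assumptions made for illustration.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Hypothetical toy data: 4 samples, 2 features.
X_demo = np.array([[0.0, 1.0], [1.0, 3.0], [2.0, 2.0], [3.0, 5.0]])

# Each step is a (name, transformer) pair; steps run in the listed order.
poly_scale = Pipeline([
    ("poly", PolynomialFeatures(degree=2, include_bias=False)),
    ("scale", StandardScaler()),
])

# fit_transform fits and applies "poly", then fits and applies "scale" to its output.
X_demo_poly = poly_scale.fit_transform(X_demo)
print(X_demo_poly.shape)  # (4, 5): x1, x2, x1^2, x1*x2, x2^2
```

With `include_bias=False`, the constant column of ones is left out. It would carry no information after scaling, and `LogisticRegression` fits its own intercept anyway.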


Build a Logistic Regression model with polynomial features and find the best C parameter using GridSearchCV

  1. Create a pipeline that produces an X_poly variable holding the degree-2 polynomial features of X, scaled.
  2. Create a param_grid dictionary that tells GridSearchCV to try the values [0.01, 0.1, 1, 10, 100] for the C parameter.
  3. Initialize and train a GridSearchCV object.
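The three steps above can be sketched as follows. This is one possible approach under stated assumptions, not the course's reference solution: the data comes from `make_circles` as a stand-in for the challenge's dataset, and the names `pipe` and `grid_cv`, the step names, `include_bias=False`, and `cv=5` are illustrative choices.

```python
from sklearn.datasets import make_circles
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Stand-in data (assumption): replace with the challenge's X and y.
X, y = make_circles(n_samples=200, noise=0.1, factor=0.3, random_state=0)

# Step 1: degree-2 polynomial features, then scaling.
pipe = Pipeline([
    ("poly", PolynomialFeatures(degree=2, include_bias=False)),
    ("scale", StandardScaler()),
])
X_poly = pipe.fit_transform(X)

# Step 2: candidate values for the regularization parameter C.
param_grid = {"C": [0.01, 0.1, 1, 10, 100]}

# Step 3: cross-validated search over C on the transformed features.
grid_cv = GridSearchCV(LogisticRegression(), param_grid, cv=5)
grid_cv.fit(X_poly, y)

print("Best C:", grid_cv.best_params_["C"])
print(f"Best cross-validated accuracy: {grid_cv.best_score_:.2f}")
```

After fitting, `best_params_` holds the winning value of C and `best_score_` its mean cross-validated accuracy. The squared terms let the model draw a circular boundary, so on data like this the score should be far higher than with the raw features.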
