Notice: This page requires JavaScript to function properly.
Please enable JavaScript in your browser settings or update your browser.
Oppiskele Challenge: Classifying Unseparateble Data | Logistic Regression
Classification with Python

Pyyhkäise näyttääksesi valikon

book
Challenge: Classifying Unseparateble Data

In this Challenge, you are given the following dataset:

1234
import pandas as pd df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b71ff7ac-3932-41d2-a4d8-060e24b00129/circles.csv') print(df.head())
copy

Here is its plot.

12345
import pandas as pd import matplotlib.pyplot as plt df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b71ff7ac-3932-41d2-a4d8-060e24b00129/circles.csv') plt.scatter(df['X1'], df['X2'], c=df['y'])
copy

The dataset is for sure not linearly separable. Let's look at the Logistic Regression performance:

123456789101112
import pandas as pd from sklearn.preprocessing import StandardScaler from sklearn.linear_model import LogisticRegression from sklearn.model_selection import cross_val_score df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b71ff7ac-3932-41d2-a4d8-060e24b00129/circles.csv') X = df[['X1', 'X2']] y = df['y'] X = StandardScaler().fit_transform(X) lr = LogisticRegression().fit(X, y) print(cross_val_score(lr, X, y).mean())
copy

The result is awful. Regular Logistic Regression is not suited for this task. Your task is to check whether the PolynomialFeatures will help. To find the best C parameter, you will use the GridSearchCV class.

In this challenge, the Pipeline is used. You can think of it as a list of preprocessing steps. Its .fit_transform() method sequentially applies .fit_transform() to each item.

Tehtävä

Swipe to start coding

Build a Logistic Regression model with polynomial features and find the best C parameter using GridSearchCV

  1. Create a pipeline to make an X_poly variable that will hold the polynomial features of degree 2 of X and be scaled.
  2. Create a param_grid dictionary to tell the GridSearchCV you want to try values [0.01, 0.1, 1, 10, 100] of a C parameter.
  3. Initialize and train a GridSearchCV object.

Ratkaisu

Switch to desktopVaihda työpöytään todellista harjoitusta vartenJatka siitä, missä olet käyttämällä jotakin alla olevista vaihtoehdoista
Oliko kaikki selvää?

Miten voimme parantaa sitä?

Kiitos palautteestasi!

Osio 2. Luku 6
single

single

Kysy tekoälyä

expand

Kysy tekoälyä

ChatGPT

Kysy mitä tahansa tai kokeile jotakin ehdotetuista kysymyksistä aloittaaksesi keskustelumme

close

Awesome!

Completion rate improved to 3.57

book
Challenge: Classifying Unseparateble Data

In this Challenge, you are given the following dataset:

1234
import pandas as pd df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b71ff7ac-3932-41d2-a4d8-060e24b00129/circles.csv') print(df.head())
copy

Here is its plot.

12345
import pandas as pd import matplotlib.pyplot as plt df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b71ff7ac-3932-41d2-a4d8-060e24b00129/circles.csv') plt.scatter(df['X1'], df['X2'], c=df['y'])
copy

The dataset is for sure not linearly separable. Let's look at the Logistic Regression performance:

123456789101112
import pandas as pd from sklearn.preprocessing import StandardScaler from sklearn.linear_model import LogisticRegression from sklearn.model_selection import cross_val_score df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b71ff7ac-3932-41d2-a4d8-060e24b00129/circles.csv') X = df[['X1', 'X2']] y = df['y'] X = StandardScaler().fit_transform(X) lr = LogisticRegression().fit(X, y) print(cross_val_score(lr, X, y).mean())
copy

The result is awful. Regular Logistic Regression is not suited for this task. Your task is to check whether the PolynomialFeatures will help. To find the best C parameter, you will use the GridSearchCV class.

In this challenge, the Pipeline is used. You can think of it as a list of preprocessing steps. Its .fit_transform() method sequentially applies .fit_transform() to each item.

Tehtävä

Swipe to start coding

Build a Logistic Regression model with polynomial features and find the best C parameter using GridSearchCV

  1. Create a pipeline to make an X_poly variable that will hold the polynomial features of degree 2 of X and be scaled.
  2. Create a param_grid dictionary to tell the GridSearchCV you want to try values [0.01, 0.1, 1, 10, 100] of a C parameter.
  3. Initialize and train a GridSearchCV object.

Ratkaisu

Switch to desktopVaihda työpöytään todellista harjoitusta vartenJatka siitä, missä olet käyttämällä jotakin alla olevista vaihtoehdoista
Oliko kaikki selvää?

Miten voimme parantaa sitä?

Kiitos palautteestasi!

close

Awesome!

Completion rate improved to 3.57

Pyyhkäise näyttääksesi valikon

some-alt