Challenge: Classifying Inseparable Data | Logistic Regression
Classification with Python

Challenge: Classifying Inseparable Data

In this Challenge, you are given the following dataset:

    import pandas as pd

    df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b71ff7ac-3932-41d2-a4d8-060e24b00129/circles.csv')
    print(df.head())

Here is its plot.

    import pandas as pd
    import matplotlib.pyplot as plt

    df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b71ff7ac-3932-41d2-a4d8-060e24b00129/circles.csv')
    # Scatter plot of the two features, colored by class label
    plt.scatter(df['X1'], df['X2'], c=df['y'])
    plt.show()

The dataset is clearly not linearly separable. Let's look at how Logistic Regression performs on it:

    import pandas as pd
    from sklearn.preprocessing import StandardScaler
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b71ff7ac-3932-41d2-a4d8-060e24b00129/circles.csv')
    X = df[['X1', 'X2']]
    y = df['y']
    # Standardize both features
    X = StandardScaler().fit_transform(X)
    lr = LogisticRegression().fit(X, y)
    # cross_val_score refits clones of lr on each fold; print the mean accuracy
    print(cross_val_score(lr, X, y).mean())

The result is poor: plain Logistic Regression draws a linear decision boundary, so it is not suited to this task. Your task is to check whether PolynomialFeatures helps. To find the best value of the C parameter, you will use the GridSearchCV class.

This challenge uses a Pipeline. You can think of it as a list of preprocessing steps: its .fit_transform() method applies .fit_transform() to each step in sequence, passing each step's output to the next.
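As a rough illustration of that chaining, the sketch below builds a two-step Pipeline on random toy data and checks that it matches applying the two transformers by hand. The step names 'poly' and 'scale', the toy data, and include_bias=False are choices made for this example only, not requirements of the challenge.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Random toy data standing in for the two real features (illustration only).
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(50, 2))

# Each step is a (name, transformer) pair; the names are arbitrary labels.
pipe = Pipeline([
    ('poly', PolynomialFeatures(degree=2, include_bias=False)),
    ('scale', StandardScaler()),
])
X_toy_poly = pipe.fit_transform(X_toy)

# The same result, obtained by applying the two transformers one after another.
expanded = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X_toy)
manual = StandardScaler().fit_transform(expanded)

print(X_toy_poly.shape)
print(np.allclose(X_toy_poly, manual))
```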

Task

Build a Logistic Regression model with polynomial features and find the best C parameter using GridSearchCV.

  1. Create a pipeline that produces an X_poly variable holding the degree-2 polynomial features of X, scaled.
  2. Create a param_grid dictionary that tells GridSearchCV to try the values [0.01, 0.1, 1, 10, 100] for the C parameter.
  3. Initialize and train a GridSearchCV object.
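One possible way to wire the three steps together is sketched below. It is not the official solution: it uses sklearn's make_circles as a synthetic stand-in for circles.csv, and the names pipe and grid_cv are assumptions made for this example.

```python
from sklearn.datasets import make_circles
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic stand-in for circles.csv (assumption: two noisy concentric rings).
X, y = make_circles(n_samples=300, noise=0.1, factor=0.5, random_state=0)

# Step 1: degree-2 polynomial features, then scaling.
pipe = Pipeline([
    ('poly', PolynomialFeatures(degree=2)),
    ('scale', StandardScaler()),
])
X_poly = pipe.fit_transform(X)

# Step 2: candidate values for the regularization parameter C.
param_grid = {'C': [0.01, 0.1, 1, 10, 100]}

# Step 3: cross-validated search over C (default 5-fold CV, accuracy scoring).
grid_cv = GridSearchCV(LogisticRegression(), param_grid)
grid_cv.fit(X_poly, y)

print(grid_cv.best_params_)
print(grid_cv.best_score_)
```

After fitting, best_params_ holds the selected C and best_score_ the mean cross-validated accuracy for it; on the real dataset the numbers may differ from those on this synthetic data.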
