Challenge: Classifying Inseparable Data

In this Challenge, you are given the following dataset:


import pandas as pd

df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b71ff7ac-3932-41d2-a4d8-060e24b00129/circles.csv')
print(df.head())

Here is its plot.


import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b71ff7ac-3932-41d2-a4d8-060e24b00129/circles.csv')
# Color each point by its class label
plt.scatter(df['X1'], df['X2'], c=df['y'])
plt.show()

The dataset is clearly not linearly separable. Let's look at how Logistic Regression performs:


import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b71ff7ac-3932-41d2-a4d8-060e24b00129/circles.csv')
X = df[['X1', 'X2']]
y = df['y']

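# Scale the features, then estimate accuracy with cross-validation
# (cross_val_score fits its own copies of the model, so no prior .fit() is needed)
X = StandardScaler().fit_transform(X)
lr = LogisticRegression()
print(cross_val_score(lr, X, y).mean())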

The cross-validated accuracy is poor. Plain Logistic Regression can only draw a linear decision boundary, so it is not suited to this task. Your task is to check whether PolynomialFeatures helps. To find the best C parameter, you will use the GridSearchCV class.

In this challenge, a Pipeline is used. You can think of it as a list of preprocessing steps. Its .fit_transform() method applies .fit_transform() to each step in sequence, passing the output of one step to the next.
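The following sketch illustrates that behavior on made-up random data. It builds the pipeline with make_pipeline, which is one possible choice here (the challenge's starter code may construct the Pipeline differently), and checks that the result matches applying the two steps by hand.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Made-up data standing in for the X1 and X2 columns
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 2))

# Pipeline: polynomial features first, then scaling
pipe = make_pipeline(PolynomialFeatures(degree=2, include_bias=False), StandardScaler())
X_via_pipeline = pipe.fit_transform(X_demo)

# The same two steps applied by hand, one after the other
X_step1 = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X_demo)
X_manual = StandardScaler().fit_transform(X_step1)

print(X_via_pipeline.shape)                   # (100, 5)
print(np.allclose(X_via_pipeline, X_manual))  # True
```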

Task

Build a Logistic Regression model with polynomial features and find the best C parameter using GridSearchCV.

Create a pipeline that produces an X_poly variable holding the scaled degree-2 polynomial features of X.
Create a param_grid dictionary that tells GridSearchCV to try the values [0.01, 0.1, 1, 10, 100] for the C parameter.
Initialize and train a GridSearchCV object.
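One way these three steps could fit together is sketched below. It is not the official solution: it runs on synthetic data from make_circles as a stand-in for circles.csv, and the names pipe and grid_cv are illustrative choices rather than names taken from the challenge.

```python
from sklearn.datasets import make_circles
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic stand-in for circles.csv (an assumption, not the course data)
X, y = make_circles(n_samples=300, noise=0.05, factor=0.5, random_state=0)

# Step 1: degree-2 polynomial features, then scaling
pipe = make_pipeline(PolynomialFeatures(degree=2, include_bias=False), StandardScaler())
X_poly = pipe.fit_transform(X)

# Step 2: candidate values for the regularization parameter C
param_grid = {'C': [0.01, 0.1, 1, 10, 100]}

# Step 3: cross-validated search over C
grid_cv = GridSearchCV(LogisticRegression(), param_grid=param_grid)
grid_cv.fit(X_poly, y)

print(grid_cv.best_params_)
print(grid_cv.best_score_)
```

Here the preprocessing is fitted on all of X before the search, as the task describes. Passing a pipeline that ends in LogisticRegression to GridSearchCV instead would refit the preprocessing inside each cross-validation fold.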
