Classification with Python | Decision Tree

Challenge: Implementing a Decision Tree

In this challenge, you will use the Titanic dataset. It holds information about passengers on the Titanic, including their age, sex, and family size. The task is to predict whether a passenger survived.

import pandas as pd

df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b71ff7ac-3932-41d2-a4d8-060e24b00129/titanic.csv')
print(df.head())

To implement the Decision Tree, you can use the DecisionTreeClassifier class from sklearn.tree.
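As a rough illustration of how the class is used, here is a minimal sketch. It fits a tree on synthetic data from make_classification, which is a stand-in chosen for this example and not the Titanic dataset; the variable names X_demo, y_demo, and tree are likewise hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption; not the Titanic dataset)
X_demo, y_demo = make_classification(n_samples=200, n_features=5, random_state=0)

# max_depth and min_samples_leaf limit how far the tree can grow
tree = DecisionTreeClassifier(max_depth=3, min_samples_leaf=2, random_state=0)
tree.fit(X_demo, y_demo)

print(tree.get_depth())          # actual depth, never more than max_depth
print(tree.predict(X_demo[:5]))  # predicted class labels for the first five rows
```

Smaller max_depth and larger min_samples_leaf both give a simpler tree, which is why these two parameters are the ones tuned in this challenge.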

Your task is to build a Decision Tree and find the best max_depth and min_samples_leaf using grid search.
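To make the idea of grid search concrete, the sketch below does by hand roughly what GridSearchCV automates: it tries every combination of the two parameters and keeps the one with the best mean cross-validated accuracy. It runs on synthetic data (an assumption for illustration, not the Titanic dataset), and the names X_demo, y_demo, best_score, and best_params are hypothetical.

```python
from itertools import product

from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption; not the Titanic dataset)
X_demo, y_demo = make_classification(n_samples=300, n_features=6, random_state=0)

max_depth_values = [1, 2, 3, 4, 5, 6, 7]
min_samples_leaf_values = [1, 2, 4, 6]

best_score, best_params = -1.0, None
# Try every combination: 7 * 4 = 28 candidate trees
for max_depth, min_samples_leaf in product(max_depth_values, min_samples_leaf_values):
    model = DecisionTreeClassifier(
        max_depth=max_depth, min_samples_leaf=min_samples_leaf, random_state=0
    )
    # Mean accuracy over 10 cross-validation folds
    score = cross_val_score(model, X_demo, y_demo, cv=10).mean()
    if score > best_score:
        best_score, best_params = score, (max_depth, min_samples_leaf)

print(best_params, best_score)
```

GridSearchCV wraps this loop in a single object and, by default, also refits the best model on the full data so that it can be used for prediction right away.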

Task

  1. Import the DecisionTreeClassifier class from sklearn.tree.
  2. Assign an instance of DecisionTreeClassifier to the decision_tree variable.
  3. Create a dictionary for GridSearchCV that covers the values [1, 2, 3, 4, 5, 6, 7] for max_depth and [1, 2, 4, 6] for min_samples_leaf.
  4. Create a GridSearchCV object and train it.

Solution

import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV
# Read the data and assign the variables
df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b71ff7ac-3932-41d2-a4d8-060e24b00129/titanic.csv')
X = df.drop(columns=['Survived'])
y = df['Survived']

decision_tree = DecisionTreeClassifier()
param_grid = {'max_depth': [1, 2, 3, 4, 5, 6, 7], 'min_samples_leaf': [1, 2, 4, 6]}
# Use `GridSearchCV` to find the best parameters
grid = GridSearchCV(decision_tree, param_grid, cv=10).fit(X, y)
# Print the best estimator and score
print(grid.best_estimator_)
print(grid.best_score_)
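
Beyond best_estimator_ and best_score_, a fitted GridSearchCV object also exposes best_params_ and cv_results_, which help when comparing the parameter combinations. The sketch below shows one way to inspect them. It uses synthetic data in place of the Titanic dataset (an assumption for illustration), and the names X_demo, y_demo, grid_demo, and results are hypothetical.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption; not the Titanic dataset)
X_demo, y_demo = make_classification(n_samples=300, n_features=6, random_state=0)

param_grid = {'max_depth': [1, 2, 3, 4, 5, 6, 7], 'min_samples_leaf': [1, 2, 4, 6]}
grid_demo = GridSearchCV(
    DecisionTreeClassifier(random_state=0), param_grid, cv=10
).fit(X_demo, y_demo)

# The winning parameter combination as a plain dictionary
print(grid_demo.best_params_)

# One row per combination, sorted from best to worst mean cross-validated score
results = pd.DataFrame(grid_demo.cv_results_)
columns = ['param_max_depth', 'param_min_samples_leaf', 'mean_test_score', 'rank_test_score']
print(results[columns].sort_values('rank_test_score').head())
```

Looking at the top few rows rather than only the winner shows whether several combinations score almost the same, in which case the simpler tree may be the more sensible choice.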


Section 3. Chapter 4
import pandas as pd
from sklearn.tree import ___
from sklearn.model_selection import GridSearchCV
# Read the data and assign the variables
df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b71ff7ac-3932-41d2-a4d8-060e24b00129/titanic.csv')
X = df.drop(columns=['Survived'])
y = df['Survived']

decision_tree = ___()
param_grid = {'max_depth': [1, 2, 3, 4, 5, 6, 7], '___': [1, 2, 4, 6]}
# Use `GridSearchCV` to find the best parameters
grid = GridSearchCV(decision_tree, param_grid, cv=10).___(X, y)
# Print the best estimator and score
print(grid.best_estimator_)
print(grid.best_score_)
