Learn Challenge: Implementing a Random Forest | Random Forest
Classification with Python

Challenge: Implementing a Random Forest

In this chapter, you will build a Random Forest using the same Titanic dataset.

You will also calculate the cross-validation accuracy using the cross_val_score() function.
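As a minimal sketch of what cross_val_score() returns, the example below runs 10-fold cross-validation on a synthetic dataset. The data from make_classification and the names X_demo and y_demo are assumptions made for illustration; the chapter itself uses the Titanic data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (an assumption; the chapter uses the Titanic dataset)
X_demo, y_demo = make_classification(n_samples=200, n_features=5, random_state=0)

# cross_val_score clones the estimator and refits it on each training split,
# so the model does not have to be fitted beforehand
model = RandomForestClassifier(n_estimators=50, random_state=0)
scores = cross_val_score(model, X_demo, y_demo, cv=10)

print(scores.shape)   # one accuracy value per fold
print(scores.mean())  # the mean is the figure usually reported
```

Because the function returns one score per fold, calling .mean() on the result gives a single cross-validation accuracy.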

Finally, you will print the feature importances.
The feature_importances_ attribute holds only an array of importance values, without the names of the features.
To print (name, importance) pairs, you can use the following syntax:

python
for f in zip(X.columns, model.feature_importances_):
    print(f)
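If a ranked view is more convenient, the pairs can also be sorted before printing. The sketch below is one possible way to do that; it uses synthetic data, and the column names f0 to f3 are hypothetical, not the Titanic columns.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data with hypothetical column names (not the Titanic columns)
features, target = make_classification(
    n_samples=200, n_features=4, n_informative=2, n_redundant=0, random_state=0
)
X_demo = pd.DataFrame(features, columns=['f0', 'f1', 'f2', 'f3'])

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_demo, target)

# Pair each name with its importance, then sort from most to least important
pairs = sorted(
    zip(X_demo.columns, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in pairs:
    print(f'{name}: {importance:.3f}')
```

The importances of a fitted forest are normalized, so the printed values should add up to roughly 1.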
Task

  1. Import the RandomForestClassifier class.
  2. Create an instance of the RandomForestClassifier class with default parameters and train it.
  3. Print the cross-validation score of the random_forest you just built, using cv=10.
  4. Print each feature's importance along with its name.

Solution

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
# Read the data and assign the variables
df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b71ff7ac-3932-41d2-a4d8-060e24b00129/titanic.csv')
X = df.drop('Survived', axis=1)
y = df['Survived']
# Build and train a Random Forest
random_forest = RandomForestClassifier().fit(X, y)
# Print the cross-validation accuracy
print(cross_val_score(random_forest, X, y, cv=10).mean())

# Print each feature's importance along with its name
for feature in zip(X.columns, random_forest.feature_importances_):
    print(feature)


import pandas as pd
from sklearn.ensemble import ___
from sklearn.model_selection import cross_val_score
# Read the data and assign the variables
df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b71ff7ac-3932-41d2-a4d8-060e24b00129/titanic.csv')
X = df.drop('Survived', axis=1)
y = df['Survived']
# Build and train a Random Forest
random_forest = ___().___(X, y)
# Print the cross-validation accuracy
print(cross_val_score(___, ___, ___, cv=10).mean())

for feature in zip(X.columns, random_forest.___):
    print(feature)
