Course Content

Ensemble Learning

1. Basic Principles of Building Ensemble Models

What is Ensemble of Models?Bagging Models Boosting Models Stacking Models

2. Commonly Used Bagging Models

Bagging Classifier Challenge: Solving Task Using Bagging Classifier Bagging Regressor Challenge: Solving Task Using Bagging Regressor Random Forest Challenge: Determining Feature Importances Using Random Forest ExtraTrees

3. Commonly Used Boosting Models

AdaBoost Classifier Challenge: Solving Task Using AdaBoost Classifier Challenge: Solving Task Using AdaBoost Regressor Gradient Boosting XGBoost Challenge: Solving Task Using XGBoost

4. Commonly Used Stacking Models

Stacking Classifier Challenge: Solving Task Using Stacking Classifier Challenge: Solving Task Using Stacking Regressor Using Ensembles As Base Models Course Summary

Gradient Boosting

Gradient Boosting is a powerful boosting ensemble learning technique for classification and regression tasks.

How does Gradient Boosting work?

Base Model Initialization: The process starts with initializing a base model as the first weak learner. This initial model makes predictions, but they may not be very accurate;
Residual Calculation: The difference between the actual target values and the predictions of the current model is calculated. These differences, known as residuals or errors, represent the "residuals" the next model will try to correct;
New Model Fitting: A new weak learner is fitted to predict the residuals from the previous step. This new model aims to correct the mistakes made by the previous model;
Combining Predictions: The new model's predictions are added to the predictions of the previous model. The combined predictions start to approximate the actual target values more closely;
Iterative Process: Steps 3 and 4 are repeated for a specified number of iterations (or until a stopping criterion is met). In each iteration, a new model is fit to predict the residuals of the combined predictions from previous iterations;
Final Prediction: After completing all iterations, the final prediction is obtained by adding the weak learners' predictions together. This ensemble of models forms a strong learner that has learned to correct the errors of the previous models.

Note

We can also calculate feature importance using the trained model's .feature_importances_ attribute.

Example

Note

It's important to notice that GradientBoostingRegressor and GradientBoostingClassifier classes in Python are designed to use only DecisionTreeRegressor and DecisionTreeClassifier as base models of an ensemble!


              1234567891011121314151617181920212223242526
            
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score
import seaborn as sns

# Load the Iris dataset
data = load_iris()
X = data.data
y = data.target

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and train the Gradient Boosting classifier
clf = GradientBoostingClassifier(n_estimators=100)
clf.fit(X_train, y_train)

# Make predictions
y_pred = clf.predict(X_test)

# Calculate accuracy
f1 = f1_score(y_test, y_pred, average='weighted')
print(f'F1 score: {f1:.4f}')

Everything was clear?

Thanks for your feedback!

Section 3. Chapter 4

Ask AI

Ask anything or try one of the suggested questions to begin our chat