Gradient Boosting | Commonly Used Boosting Models

Gradient Boosting

Gradient Boosting is a powerful boosting ensemble learning technique for classification and regression tasks.

How does Gradient Boosting work?

  1. Base Model Initialization: The process starts by initializing a base model as the first weak learner. This initial model makes predictions, but they may not be very accurate.
  2. Residual Calculation: The differences between the actual target values and the predictions of the current model are calculated. These differences, known as residuals or errors, are what the next model will try to correct.
  3. New Model Fitting: A new weak learner is fitted to predict the residuals from the previous step. This new model aims to correct the mistakes made by the previous model.
  4. Combining Predictions: The new model's predictions are added to the predictions of the previous model. The combined predictions approximate the actual target values more closely.
  5. Iterative Process: Steps 2–4 are repeated for a specified number of iterations (or until a stopping criterion is met). In each iteration, a new model is fit to predict the residuals of the combined predictions from the previous iterations.
  6. Final Prediction: After completing all iterations, the final prediction is obtained by adding the weak learners' predictions together. This ensemble of models forms a strong learner that has learned to correct the errors of the previous models.
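The steps above can be sketched from scratch for a regression task. This is a minimal illustration, not scikit-learn's actual implementation; the synthetic dataset, tree depth, and learning rate are assumptions chosen for the example:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.1, 200)

n_estimators, learning_rate = 50, 0.1

# Step 1: initialize the ensemble with a constant prediction (the target mean)
prediction = np.full_like(y, y.mean())
trees = []

for _ in range(n_estimators):
    # Step 2: residuals between the targets and the current combined prediction
    residuals = y - prediction
    # Step 3: fit a new weak learner (a shallow tree) to those residuals
    tree = DecisionTreeRegressor(max_depth=2)
    tree.fit(X, residuals)
    trees.append(tree)
    # Step 4: add the new model's (scaled) predictions to the ensemble
    prediction += learning_rate * tree.predict(X)

# Step 6: the final prediction is the accumulated sum of all learners
mse = np.mean((y - prediction) ** 2)
```

The learning rate shrinks each tree's contribution, which typically improves generalization at the cost of needing more iterations.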

    Note

    We can also calculate feature importance using the trained model's .feature_importances_ attribute.
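For instance, a sketch of reading that attribute from a fitted model (the synthetic dataset here is an assumption for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic data: 5 features, of which 3 are informative
X, y = make_classification(n_samples=300, n_features=5,
                           n_informative=3, random_state=42)

model = GradientBoostingClassifier(n_estimators=50, random_state=42)
model.fit(X, y)

# One importance score per feature; scores are non-negative and sum to 1
importances = model.feature_importances_
```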

Example

Note

Note that the GradientBoostingRegressor and GradientBoostingClassifier classes in scikit-learn are designed to use only DecisionTreeRegressor and DecisionTreeClassifier as base models of the ensemble.
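The original code block is not shown on this page, but based on the Code Description below, a minimal version might look like this (the synthetic dataset and train/test split are assumptions, since the original data is not given):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumed synthetic dataset standing in for the course data
X, y = make_classification(n_samples=500, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# Create the classifier with 100 boosting iterations
clf = GradientBoostingClassifier(n_estimators=100)

# Train on the training data
clf.fit(X_train, y_train)

# Predict labels for the testing data
y_pred = clf.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print(accuracy)
```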

Code Description
  • Creating the GradientBoostingClassifier:
  • A GradientBoostingClassifier instance named clf is created.
    The parameter n_estimators is set to 100, indicating the number of boosting iterations to perform during training.
  • Training the Classifier:
  • The clf classifier is trained on the training data (X_train, y_train) using the .fit() method.
    During each boosting iteration, a decision tree classifier is fit to the negative gradient of the loss function with respect to the current predictions.
  • Making Predictions:
  • Predictions for the testing data (X_test) are made using the trained classifier's .predict() method.
    The predicted labels are stored in the y_pred array.
    You can find the official documentation with all the necessary information about implementing this model in Python on the scikit-learn website.

    Can we use SVC (Support Vector Classifier) as a base model of the Gradient Boosting Classifier in Python?


    Section 3. Chapter 4