ExtraTrees
Extra Trees, short for Extremely Randomized Trees, is a bagging ensemble learning technique that builds upon the concept of decision trees to create a more robust and diverse model.
How does the ExtraTrees algorithm work?
It is a variation of the Random Forest algorithm but introduces even more randomness into the tree-building process:
- Like the Random Forest algorithm, the Extra Trees algorithm builds many decision trees, but the samples for each tree are drawn randomly without replacement (in scikit-learn, the default is even to train each tree on the whole original dataset), rather than with replacement as in bootstrap sampling;
- A random subset of the features is also considered at each split, rather than the full feature set;
- Extra Trees' most important and unique characteristic is the random selection of the splitting value for a feature. Instead of searching for the locally optimal cut-point with a criterion such as Gini impurity or entropy, the algorithm draws a threshold at random for each candidate feature and keeps the best of these random splits. This makes the trees diversified and uncorrelated (a toy sketch of this idea follows the list).
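To make the contrast concrete, here is a toy NumPy sketch (not scikit-learn's actual implementation, and simplified to a single feature and a single random threshold). The data and the gini and split_quality helpers are made up for illustration; they compare the locally optimal threshold a regular decision tree would search for with the uniformly random threshold Extra Trees draws:

```python
import numpy as np

def gini(y):
    """Gini impurity of a label vector."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def split_quality(x, y, threshold):
    """Weighted Gini impurity after splitting on `threshold` (lower is better)."""
    left, right = y[x <= threshold], y[x > threshold]
    if len(left) == 0 or len(right) == 0:
        return np.inf
    n = len(y)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

rng = np.random.default_rng(42)
x = rng.normal(size=200)                                      # one feature
y = (x + rng.normal(scale=0.5, size=200) > 0).astype(int)     # noisy labels

# Classic decision tree: scan candidate thresholds and keep the locally optimal one.
candidates = np.unique(x)
best = min(candidates, key=lambda t: split_quality(x, y, t))

# Extra Trees: draw the threshold uniformly at random within the feature's range.
random_threshold = rng.uniform(x.min(), x.max())

print(f"optimal threshold: {best:.3f}, impurity {split_quality(x, y, best):.3f}")
print(f"random  threshold: {random_threshold:.3f}, impurity {split_quality(x, y, random_threshold):.3f}")
```

The random split is usually a bit worse on its own node, but across hundreds of trees this extra randomness reduces the correlation between trees, which is what the ensemble benefits from.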
Note
We can also use the .feature_importances_ attribute to measure each feature's impact on the model's result.
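For instance, here is a minimal sketch using the scikit-learn iris toy dataset (the dataset and hyperparameters are illustrative, not part of this lesson):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import ExtraTreesClassifier

# Load a small toy dataset and fit an Extra Trees classifier on all of it.
iris = load_iris()
model = ExtraTreesClassifier(n_estimators=100, random_state=0)
model.fit(iris.data, iris.target)

# One importance value per feature; the values sum to 1.
for name, importance in zip(iris.feature_names, model.feature_importances_):
    print(f"{name}: {importance:.3f}")
```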
Example
We can use ExtraTrees in Python just like Random Forest, using the ExtraTreesClassifier or ExtraTreesRegressor classes:
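A minimal sketch of how this might look (the datasets and hyperparameters below are illustrative assumptions, not part of the original lesson):

```python
from sklearn.datasets import load_breast_cancer, make_regression
from sklearn.ensemble import ExtraTreesClassifier, ExtraTreesRegressor
from sklearn.model_selection import train_test_split

# Classification: used exactly like RandomForestClassifier.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = ExtraTreesClassifier(n_estimators=200, random_state=42)
clf.fit(X_train, y_train)
print("classification accuracy:", clf.score(X_test, y_test))

# Regression: the same interface via ExtraTreesRegressor.
X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

reg = ExtraTreesRegressor(n_estimators=200, random_state=42)
reg.fit(X_train, y_train)
print("regression R^2:", reg.score(X_test, y_test))
```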