Decision Tree Summary
Let's now look at some peculiarities of Decision Trees.
- Interpretability. Unlike most Machine Learning algorithms, Decision Trees are easy to visualize and interpret;
- No data preparation required. A Decision Tree requires little to no data preparation: it does not need feature scaling/normalization, it can handle missing values, and it is not affected much by outliers;
- Provides feature importances. While training, a Decision Tree calculates feature importances that reflect how impactful each feature was in building the tree. You can access them through the `.feature_importances_` attribute (see the sketch after this list);
- Computational complexity. Suppose m is the number of features and n is the number of training instances. Training a Decision Tree takes roughly O(n·m·log(n)), so training is quite fast unless the training set is very large. Making a prediction takes O(log(n)), so predictions are fast;
- Not suitable for large datasets. Although Decision Trees may work great for small sets, they usually don't perform well on large datasets, where a Random Forest is preferable;
- Decision Trees are unstable. Small changes in hyperparameters or data may produce a very different tree. While this is a disadvantage for a single tree, it becomes a benefit in a Random Forest, as you will see in the next section.
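Here is a minimal sketch (assuming scikit-learn and its built-in Iris dataset) that trains a small tree on unscaled data, reads the `.feature_importances_` attribute, and prints a text view of the tree to illustrate its interpretability:

```python
# Sketch: no feature scaling is applied, and feature importances are read
# from the fitted tree via the .feature_importances_ attribute.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
feature_names = load_iris().feature_names

# Train directly on the raw features - no scaling/normalization needed.
tree = DecisionTreeClassifier(max_depth=3, random_state=42)
tree.fit(X, y)

# Importances sum to 1; higher means the feature mattered more when
# building the tree.
for name, importance in zip(feature_names, tree.feature_importances_):
    print(f"{name}: {importance:.3f}")

# A plain-text view of the tree shows how interpretable it is.
print(export_text(tree, feature_names=feature_names))
```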
And here is a little summary:
| Advantages | Disadvantages |
| --- | --- |
| Interpretable | Overfitting |
| Fast training | Unstable |
| Fast predictions | Not suitable for large datasets |
| No feature scaling required | |
| Provides feature importances | |
| Usually robust to outliers | |
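The instability listed above is easy to observe. The sketch below (again assuming scikit-learn and the Iris dataset) fits the same kind of tree twice, once on the full data and once with a handful of samples removed; the two printed structures may already differ:

```python
# Sketch: a small change in the training data can produce a different tree.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)

# Drop 10 random samples to create a slightly different training set.
keep = rng.choice(len(X), size=len(X) - 10, replace=False)

tree_full = DecisionTreeClassifier(max_depth=2, random_state=42).fit(X, y)
tree_subset = DecisionTreeClassifier(max_depth=2, random_state=42).fit(X[keep], y[keep])

# Compare the learned structures; they may use different splits.
print(export_text(tree_full))
print(export_text(tree_subset))
```

This sensitivity is exactly what a Random Forest exploits by averaging many different trees, as the next section shows.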