Classification with Python

Decision Tree Summary

Let's now look at some of the Decision Tree's peculiarities.

  1. Interpretability.
    Unlike most Machine Learning algorithms, Decision Trees are easy to visualize and interpret;
  2. Little to no data preparation required.
    A Decision Tree requires little to no data preparation: it needs no scaling or normalization, can handle missing values, and is not affected much by outliers;
  3. Provides feature importances.
    While training, a Decision Tree calculates feature importances that reflect how impactful each feature was in building the Tree. You can access them via the .feature_importances_ attribute (see the sketch after this list);
  4. Computational complexity.
    Suppose m is the number of features and n is the number of training instances. The complexity of training a Decision Tree is O(n·m·log(n)), so training is quite fast unless the training set is very large. The complexity of predicting is O(log(n)), so predictions are fast;
  5. Not suitable for large datasets.
    Although Decision Trees may work great on small datasets, they usually don't perform well on large ones, where a Random Forest is preferable;
  6. Decision Trees are unstable.
    Small changes in hyperparameters or data may produce a very different tree (a small illustration follows the summary table below). Although this is a disadvantage for a single Tree, it will benefit us in a Random Forest, as you will see in the next section.
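
Here is a minimal sketch tying points 1 and 3 together. It assumes scikit-learn and matplotlib are installed and uses the built-in Iris dataset purely for illustration; the hyperparameter values are arbitrary.

    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier, plot_tree
    import matplotlib.pyplot as plt

    iris = load_iris()
    # max_depth=3 is an arbitrary choice to keep the plot readable
    tree = DecisionTreeClassifier(max_depth=3, random_state=42)
    tree.fit(iris.data, iris.target)

    # Importances are computed during training, one value per feature
    for name, importance in zip(iris.feature_names, tree.feature_importances_):
        print(f'{name}: {importance:.3f}')

    # A fitted Tree is easy to visualize and interpret
    plot_tree(tree, feature_names=iris.feature_names,
              class_names=list(iris.target_names), filled=True)
    plt.show()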

And here is a little summary:

    Advantages                    | Disadvantages
    ------------------------------|---------------------------------
    Interpretable                 | Overfitting
    Fast training                 | Unstable
    Fast predictions              | Not suitable for large datasets
    No feature scaling required   |
    Provides feature importances  |
    Usually robust to outliers    |
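
As a small illustration of the instability point above, the sketch below (again assuming scikit-learn and the Iris dataset) fits two trees, each time dropping a random ~10% of the rows, and prints the top two levels of each; even this small perturbation may change the chosen splits or thresholds.

    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier, export_text

    iris = load_iris()
    rng = np.random.default_rng(0)

    for trial in range(2):
        # Keep roughly 90% of the rows to perturb the training set slightly
        mask = rng.random(len(iris.data)) > 0.1
        tree = DecisionTreeClassifier(random_state=trial)
        tree.fit(iris.data[mask], iris.target[mask])
        # Print only the first two levels of each tree for comparison
        print(export_text(tree, feature_names=iris.feature_names, max_depth=2))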
