Classification with Python
Logistic Regression Summary
Let's take a closer look at Logistic Regression's pros and cons.
- Logistic Regression uses an iterative process called Gradient Descent to find the parameters.
  Because training is iterative, you can add new training data at any iteration. Even after the model is trained, you can provide additional training data and run a few more iterations to improve it.
- Logistic Regression is fast.
  Compared to other algorithms, its training time is short. Predictions are also very fast, unlike with the k-NN classifier. The computational complexity is linear with respect to the dataset's size, so Logistic Regression stays fast to train on datasets containing many instances.
- Logistic Regression scales poorly with the number of features.
  The model suffers from the curse of dimensionality: for it to work well with a sizable number of features, you need a lot of instances. The `PolynomialFeatures` class creates many additional features, which makes things even worse.
- Logistic Regression predicts probabilities.
  One of the steps Logistic Regression performs is predicting class probabilities. This is helpful in the many tasks where we need to know how confident the model is in its predictions.
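The first and last points above can be sketched in plain Python. This is a minimal illustration written from scratch, not the course's or scikit-learn's implementation: the helper names (`sigmoid`, `predict_proba`, `gradient_descent`, `make_data`), the one-feature toy dataset, the learning rate, and the iteration counts are all assumptions chosen for the example. It trains with batch Gradient Descent, then continues training from the current parameters after more data arrives, and finally reads off predicted probabilities.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(weights, bias, x):
    # Probability that instance x belongs to class 1.
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def gradient_descent(X, y, weights, bias, lr=0.1, n_iter=200):
    # Run n_iter batch Gradient Descent steps on the log loss,
    # starting from the given parameters (not from scratch).
    n = len(X)
    for _ in range(n_iter):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, target in zip(X, y):
            error = predict_proba(weights, bias, x) - target
            for j, xj in enumerate(x):
                grad_w[j] += error * xj
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def make_data(n, rng):
    # Hypothetical toy data: one feature, class 1 when the feature is positive.
    X = [[rng.uniform(-3, 3)] for _ in range(n)]
    y = [1 if x[0] > 0 else 0 for x in X]
    return X, y

rng = random.Random(0)

# Initial training run.
X_first, y_first = make_data(50, rng)
weights, bias = gradient_descent(X_first, y_first, [0.0], 0.0)

# New data arrives: keep the learned parameters and run a few more iterations.
X_new, y_new = make_data(50, rng)
weights, bias = gradient_descent(
    X_first + X_new, y_first + y_new, weights, bias, n_iter=50
)

# Probabilities express confidence: far from the boundary vs. close to it.
p_far = predict_proba(weights, bias, [2.5])
p_near = predict_proba(weights, bias, [0.1])
print(f"P(class 1 | x=2.5) = {p_far:.3f}")
print(f"P(class 1 | x=0.1) = {p_near:.3f}")
```

The second call to `gradient_descent` starts from the already learned `weights` and `bias`, which is what makes adding data cheap. An instance far from the decision boundary gets a probability close to 1, while one near the boundary gets a value closer to 0.5.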
To sum it all up, here is a table with Logistic Regression's advantages and disadvantages.
| Advantages | Disadvantages |
|---|---|
| Fast training | Needs feature scaling when regularization is used |
| Scales well to a large number of training instances | Linear decision boundary unless `PolynomialFeatures` is used |
| Easy to add new training data | Doesn't work well with a large number of features |
| Fast predictions | Prone to overfitting, especially with `PolynomialFeatures` |
| Predicts probabilities | |
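To make the `PolynomialFeatures` concern concrete, the sketch below counts how many columns a full polynomial expansion produces. It assumes the class's default settings (all monomials up to the given degree, including the constant bias column), for which the count is the binomial coefficient C(n + d, d). The helper name `n_polynomial_features` is hypothetical and only used for this illustration.

```python
from math import comb

def n_polynomial_features(n_features, degree, include_bias=True):
    # Number of monomials of total degree <= degree in n_features variables.
    total = comb(n_features + degree, degree)
    return total if include_bias else total - 1

for n in (2, 10, 100):
    counts = [n_polynomial_features(n, d) for d in (1, 2, 3)]
    print(f"{n} original features -> degrees 1, 2, 3: {counts}")
```

With 100 original features, a degree-3 expansion already yields 176851 columns, which is why combining many features with `PolynomialFeatures` quickly runs into both of the disadvantages listed in the table.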
All in all, Logistic Regression is a good algorithm for simple tasks with few features, but it handles data with many features poorly.