# k-NN Summary

From what we have learned, we can tell that k-NN is easy to implement but requires feature scaling. It has a few more peculiarities:
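To see why scaling matters, here is a small illustration in plain Python (the feature values and the `standardize` helper are made up for this example): a feature measured in dollars completely dominates the Euclidean distance over a feature measured in years until both are put on the same scale.

```python
import math

def standardize(column):
    """Z-score scaling: zero mean, unit variance."""
    mean = sum(column) / len(column)
    std = (sum((v - mean) ** 2 for v in column) / len(column)) ** 0.5
    return [(v - mean) / std for v in column]

# Two features on wildly different scales: age (years) and income (dollars)
ages = [25, 40, 27]
incomes = [30_000, 31_000, 90_000]

# Raw distances from person 0: income differences dominate completely
raw = [math.dist((ages[0], incomes[0]), (a, i))
       for a, i in zip(ages, incomes)]

# After scaling, both features contribute comparably
s_ages, s_incomes = standardize(ages), standardize(incomes)
scaled = [math.dist((s_ages[0], s_incomes[0]), (a, i))
          for a, i in zip(s_ages, s_incomes)]

print(raw)
print(scaled)
```

On the raw features, person 1 looks nearest to person 0 purely because their incomes match; after scaling, person 2 (the closer age) becomes the nearer neighbor.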

**k-NN does not require training.**

Unlike many other algorithms, **k-NN does not learn anything during training**. It just stores the coordinates of all the training data points. But since all the calculations are performed at prediction time, the **prediction time is larger** compared to other algorithms.

**k-NN is a lazy algorithm.**

The model calculates the distance to every training instance to find the neighbors. Thus, it may get **painfully slow for large datasets**.

**Easy to add new training data.**

Since the model does not need to train, we can just add new training data points, and the predictions will adjust.

**The curse of dimensionality.**

Some algorithms really struggle when the number of dimensions (features) is large, and unfortunately, k-NN has this problem too. In high-dimensional space, the distances between points tend to become similar regardless of the actual feature values, so it becomes much harder to determine whether two instances are truly alike.
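A quick simulation makes this concrete (the `relative_spread` helper and the uniform random points are assumptions for the demo): as the dimension grows, the gap between the nearest and the farthest random point shrinks relative to the nearest distance, so "nearest" carries less and less information.

```python
import random

def relative_spread(dim, n_points=500, seed=42):
    """(farthest - nearest) / nearest Euclidean distance from the origin."""
    rng = random.Random(seed)
    dists = []
    for _ in range(n_points):
        p = [rng.random() for _ in range(dim)]          # uniform point in [0, 1]^dim
        dists.append(sum(c * c for c in p) ** 0.5)      # distance to the origin
    return (max(dists) - min(dists)) / min(dists)

for dim in (2, 10, 100, 1000):
    print(dim, round(relative_spread(dim), 2))
```

The printed spread drops steadily with the dimension, which is exactly the distance concentration that hurts k-NN.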

So, here is a little summary of the k-NN algorithm:

| Advantages | Disadvantages |
| --- | --- |
| No training time | Needs feature scaling |
| Easy to add new training data | Prediction time is high |
| | Doesn't work well with a large number of training instances |
| | Doesn't work well with a large number of features |


Section 1. Chapter 8

Classification with Python
