The selected pair of months looked good on the scatter plot, didn't it? There may have been no key differences between the 'areas' on the plot, but at least no points fell outside their respective groups, and in general all the groups were disjoint.

Finally, let's find out the yearly dynamics for each cluster, i.e. let's build a line plot of the monthly averages for each group of points.
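As a rough sketch of that step, assuming every point carries twelve monthly values plus a cluster label obtained earlier, the monthly averages per cluster could be computed as below. The `records` data and the `monthly_means_by_cluster` helper are invented for illustration and are not part of the course dataset.

```python
from collections import defaultdict
from statistics import fmean

# Hypothetical input: one (cluster label, 12 monthly values) pair per point.
records = [
    (0, [1.0, 2.0, 5.0, 9.0, 14.0, 18.0, 20.0, 19.0, 15.0, 10.0, 5.0, 2.0]),
    (0, [3.0, 4.0, 7.0, 11.0, 16.0, 20.0, 22.0, 21.0, 17.0, 12.0, 7.0, 4.0]),
    (1, [20.0, 21.0, 22.0, 24.0, 26.0, 28.0, 29.0, 29.0, 27.0, 25.0, 22.0, 21.0]),
]

def monthly_means_by_cluster(records):
    """Return {cluster label: [mean of month 1, ..., mean of month 12]}."""
    grouped = defaultdict(list)
    for label, values in records:
        grouped[label].append(values)
    # zip(*rows) turns the rows of one cluster into per-month columns.
    return {
        label: [fmean(column) for column in zip(*rows)]
        for label, rows in grouped.items()
    }

dynamics = monthly_means_by_cluster(records)
for label, means in sorted(dynamics.items()):
    print(label, means)
```

Each list in `dynamics` corresponds to one line on the plot: the x-axis is the month and the y-axis is the cluster's average for that month.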

Clustering is a common data science task: grouping a set of objects so that the objects within each group are as similar to each other as possible. Cluster analysis itself is not an algorithm; it is a general task to be solved. Many clustering algorithms exist, but we will focus on four of them.

The first algorithm to be considered is K-Means. This algorithm uses centroids to split the points into clusters. In this section, you will learn how to implement the algorithm and how to choose the number of clusters.
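To make the centroid idea concrete, here is a minimal pure-Python sketch of the classic K-Means loop (assign each point to its nearest centroid, then move each centroid to the mean of its points). The function names, the toy `points`, and the use of inertia to compare values of k are assumptions made for illustration; the course may rely on a library implementation and a different criterion.

```python
import math
import random

def k_means(points, k, n_iter=100, seed=0):
    """Minimal K-Means sketch: returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen data points
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # leave an empty cluster in place
        if new_centroids == centroids:  # nothing moved, so the loop has converged
            break
        centroids = new_centroids
    return centroids, labels

def inertia(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))

# Toy data: two well-separated groups of four points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]

for k in range(1, 5):
    centroids, labels = k_means(points, k)
    print(k, round(inertia(points, centroids, labels), 3))
```

One common heuristic for choosing the number of clusters is to look for the value of k after which the inertia stops dropping sharply. On this toy data the large drop happens between k = 1 and k = 2.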

The second algorithm to be considered is K-Medoids. It works the same way as K-Means but uses medoids as the 'center' points of the clusters. In this section, you will learn how the two algorithms differ, see one more way to choose a possible number of clusters, and of course implement the algorithm.
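The difference between the two kinds of 'center' can be illustrated with a short sketch. A centroid is the coordinate-wise mean and need not coincide with any data point, while a medoid is the actual data point with the smallest total distance to all the others. The `cluster` values and helper names below are made up for illustration.

```python
import math

def centroid(points):
    """Coordinate-wise mean of the points; usually not one of the points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def medoid(points):
    """The data point whose total distance to all other points is smallest."""
    return min(points, key=lambda c: sum(math.dist(c, p) for p in points))

# Hypothetical cluster: five nearby points and one far-away outlier.
cluster = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0), (1.5, 1.5), (30.0, 30.0)]

print("centroid:", centroid(cluster))
print("medoid:  ", medoid(cluster))
```

In this example the outlier drags the centroid away from the dense group, while the medoid stays on a real point inside it, which is why medoids are often described as less sensitive to outliers.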

The third algorithm in this course is Hierarchical Clustering. This algorithm can be easily visualized using dendrograms. In this section, you will learn how to implement the algorithm and how it can be tuned to improve the clustering quality.
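As an illustration, the bottom-up (agglomerative) variant of hierarchical clustering can be sketched in a few lines: start with every point in its own cluster and repeatedly merge the two closest clusters. This naive version is an assumption about the approach, not the course's implementation. The `linkage` parameter is a hypothetical knob that shows one thing that can be tuned: `min` gives single linkage and `max` gives complete linkage.

```python
import math

def agglomerative(points, n_clusters, linkage=min):
    """Naive agglomerative clustering; returns (clusters, merge history)."""
    clusters = [[i] for i in range(len(points))]  # every point starts alone
    merges = []  # (members_a, members_b, distance), the data a dendrogram draws
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Cluster-to-cluster distance under the chosen linkage rule.
                d = linkage(
                    math.dist(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        clusters[a] = clusters[a] + clusters[b]  # merge b into a
        del clusters[b]
    return clusters, merges

# Toy data: two tight pairs and one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (20.0, 20.0)]

clusters, merges = agglomerative(points, n_clusters=2)
print(clusters)
for left, right, distance in merges:
    print(left, "+", right, "at distance", round(distance, 3))
```

The merge history is what a dendrogram visualizes: each merge becomes a horizontal bar drawn at the height of the merge distance.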

The last algorithm to be considered is probably the hardest in terms of math. In this section, you will get a brief introduction to the algorithm, learn why such a complex algorithm is worth using, and of course implement it.

Visualizing the Dynamics Across Clusters
