Fitting the Data to the Model

Now that our data is ready, let's fit the PCA model to it.
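A minimal sketch of this fitting step follows. It assumes the wine data is loaded from scikit-learn's bundled copy with `load_wine` and standardized with `StandardScaler`; the variable names (`X_scaled`, `X_reduced`) are illustrative and may differ from the ones used earlier in the course.

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: the 13-feature wine dataset bundled with scikit-learn
X, y = load_wine(return_X_y=True)

# PCA is sensitive to feature scale, so standardize each feature first
X_scaled = StandardScaler().fit_transform(X)

# Keep only the first two principal components
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_scaled)

print("Original shape:", X.shape)
print("Reduced shape:", X_reduced.shape)
```

`fit_transform` learns the components and projects the data in one call; the fitted `pca` object can later project new samples with `pca.transform`.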



We have reduced the dimensionality of the dataset from 13 features to 2! Now we can visualize the resulting components using the seaborn and matplotlib libraries:
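One way such a plot could be drawn is sketched below. It again assumes scikit-learn's `load_wine` as the data source; the figure size, axis labels, and title are illustrative choices, not taken from the original lesson.

```python
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: same wine data and scaling as in the fitting step
wine = load_wine()
X_scaled = StandardScaler().fit_transform(wine.data)
X_reduced = PCA(n_components=2).fit_transform(X_scaled)

# Map numeric targets to class names so seaborn treats hue as categorical
labels = wine.target_names[wine.target]

# Scatter plot of the two components, colored by wine class
fig, ax = plt.subplots(figsize=(7, 5))
sns.scatterplot(x=X_reduced[:, 0], y=X_reduced[:, 1], hue=labels, ax=ax)
ax.set_xlabel("Principal component 1")
ax.set_ylabel("Principal component 2")
ax.set_title("Wine dataset projected onto 2 principal components")
```

In a script, call `plt.show()` or `fig.savefig(...)` afterwards to display or save the figure; notebooks usually render it automatically.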



A natural question is how to check how effective a particular PCA model is. The performance of PCA can be assessed in two ways. The first is how much information (variance) the resulting components retain. The number of components we decide to keep determines how much of the dataset's information ultimately remains. For example, let's display the explained variance ratio:
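As a sketch, the explained variance ratio can be read from the `explained_variance_ratio_` attribute of a fitted scikit-learn `PCA`. Here the model keeps all components (the default when `n_components` is not set), and the data is assumed to be the standardized wine dataset from `load_wine`.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: standardized wine data, as in the earlier steps
X, _ = load_wine(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)

# With n_components left unset, PCA keeps all 13 components
pca_full = PCA().fit(X_scaled)

# Share of total variance captured by each component, and the running total
ratios = pca_full.explained_variance_ratio_
cumulative = np.cumsum(ratios)

for i, (ratio, total) in enumerate(zip(ratios, cumulative), start=1):
    print(f"PC{i:>2}: {ratio:.3f} (cumulative {total:.3f})")
```

The ratios are sorted from largest to smallest and sum to 1 when every component is kept.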



Above is the result of a PCA model that keeps all 13 principal components of the wine dataset (i.e. as many components as there were original variables). You can see that the first component captures 36% of the information, the first two components together capture 55%, the first three capture 66%, and so on.

The graph makes it easy to visualize the number of components required to capture varying degrees of data variability:
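A curve of this kind can be sketched by plotting the cumulative explained variance against the number of components. The 80% threshold line and the helper name `n_for_80` are illustrative assumptions, not values from the lesson.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: standardized wine data with all components kept
X, _ = load_wine(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)
cumulative = np.cumsum(PCA().fit(X_scaled).explained_variance_ratio_)

# Smallest number of components whose cumulative variance reaches 80%
threshold = 0.80  # illustrative threshold
n_for_80 = int(np.searchsorted(cumulative, threshold) + 1)

# Cumulative explained variance versus number of components
n_components = np.arange(1, len(cumulative) + 1)
fig, ax = plt.subplots(figsize=(7, 4))
ax.plot(n_components, cumulative, marker="o")
ax.axhline(threshold, linestyle="--", color="gray")
ax.set_xlabel("Number of components")
ax.set_ylabel("Cumulative explained variance")
ax.set_xticks(n_components)

print(f"Components needed for {threshold:.0%} of the variance: {n_for_80}")
```

Reading the curve from left to right shows where adding another component stops contributing much additional variance.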

The second way to evaluate a PCA model is to check the performance of the downstream machine learning models that we fit on the reduced dataset (if we need one at all). Here we look for the best trade-off among three quantities: the running time of the machine learning model, its accuracy, and the number of principal components.
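The sketch below illustrates this kind of comparison. The classifier (`LogisticRegression`), the 5-fold cross-validation, and the list of component counts are all assumptions chosen for illustration; the lesson does not prescribe a particular downstream model.

```python
import time

from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)

results = []
for n in (1, 2, 3, 5, 8, 13):  # illustrative component counts
    # Scaling and PCA live inside the pipeline so they are refit on each fold
    model = make_pipeline(
        StandardScaler(),
        PCA(n_components=n),
        LogisticRegression(max_iter=1000),
    )
    start = time.perf_counter()
    scores = cross_val_score(model, X, y, cv=5)
    elapsed = time.perf_counter() - start
    results.append((n, scores.mean(), elapsed))

for n, accuracy, seconds in results:
    print(f"{n:>2} components: accuracy={accuracy:.3f}, time={seconds:.3f}s")
```

On a dataset this small the timing differences are negligible; the same loop becomes more informative on larger datasets, where fewer components can noticeably shorten training.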

Quiz

Why do you think just 3 components can explain as much as 92% of the variance in the presented dataset?
