Principal Component Analysis
The best-known tasks that PCA is used for are:
- Data compression
- Data visualization
- Noise reduction
As you learned in the last quiz, PCA can compress many kinds of data, from numerical datasets such as time series and feature tables to images and video frames.
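To make the compression idea concrete, here is a minimal sketch using only NumPy. The helper names `pca_compress` and `pca_decompress`, the synthetic dataset, and the choice of 2 components are all assumptions made for illustration; they do not come from the lesson itself.

```python
import numpy as np

def pca_compress(X, k):
    """Project X (n_samples x n_features) onto its top-k principal components."""
    mean = X.mean(axis=0)
    Xc = X - mean  # PCA works on centered data
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]
    scores = Xc @ components.T  # compressed representation, shape (n_samples, k)
    return scores, components, mean

def pca_decompress(scores, components, mean):
    """Approximately reconstruct the original data from the compressed form."""
    return scores @ components + mean

rng = np.random.default_rng(0)
# Synthetic data (an assumption): 10 features driven by 2 hidden factors plus small noise.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 10))

scores, components, mean = pca_compress(X, k=2)
X_hat = pca_decompress(scores, components, mean)

original_size = X.size                                         # numbers stored before
compressed_size = scores.size + components.size + mean.size    # numbers stored after
rel_error = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
print(original_size, compressed_size, rel_error)
```

In this sketch the compressed form stores far fewer numbers than the original matrix, while the relative reconstruction error stays small because the synthetic data really is close to 2-dimensional. Real data will compress less cleanly.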
What about data visualization? You can use PCA as a way to visualize multidimensional data. This is an important step for developing a machine learning model.
A 3D plot can make it hard to tell whether the data really contains a dependency, whereas reducing the data to 2 dimensions can make the structure visible: if the classes turn out to be linearly separable in the projection, we can start building our classification model.
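The reduction from 3 to 2 dimensions can be sketched as follows. The two-class dataset here is synthetic and hypothetical, chosen so that the classes are well separated; the check on the first principal component is just one simple way to confirm separability, not the only one.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical 3D dataset: two classes of 100 points each,
# with class 1 shifted by 6 units along every axis.
class0 = rng.normal(loc=0.0, scale=1.0, size=(100, 3))
class1 = rng.normal(loc=6.0, scale=1.0, size=(100, 3))
X = np.vstack([class0, class1])
y = np.array([0] * 100 + [1] * 100)

# PCA via SVD of the centered data; keep the first two principal directions.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:2].T  # 2D coordinates, shape (200, 2)

# The sign of a principal component is arbitrary, so test both orderings:
# the classes are separable on PC1 if their ranges do not overlap.
pc1_class0 = Z[y == 0, 0]
pc1_class1 = Z[y == 1, 0]
separable_on_pc1 = bool(
    pc1_class0.max() < pc1_class1.min() or pc1_class1.max() < pc1_class0.min()
)
print(Z.shape, separable_on_pc1)
```

The array `Z` can be passed to any 2D scatter plot, colored by `y`, to inspect the classes visually.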
The last thing we'll look at is noise reduction. PCA projects the data onto a lower-dimensional subspace spanned by the directions of greatest variance. As expected, the projection discards some data, but it usually retains most of the information. The discarded low-variance components often consist largely of noise, so reconstructing the data from the retained components gives a cleaner version. This makes PCA a useful tool for data compression and dimensionality reduction, which can noticeably speed up machine learning analysis.
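A small experiment illustrates the noise-reduction effect. This is a sketch under stated assumptions: the helper `pca_denoise` is a hypothetical name, the clean signal is synthetic and exactly 2-dimensional, and the noise level of 0.5 is arbitrary. With real data the clean signal is unknown, so the improvement cannot be measured this directly.

```python
import numpy as np

def pca_denoise(X, k):
    """Reconstruct X from its top-k principal components.

    Returns the reconstruction and the fraction of total variance retained.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    top = Vt[:k]
    # Project onto the top-k directions, then map back to the original space.
    X_denoised = Xc @ top.T @ top + mean
    # Squared singular values are proportional to the variance of each component.
    retained = float((S[:k] ** 2).sum() / (S ** 2).sum())
    return X_denoised, retained

rng = np.random.default_rng(1)
# Synthetic clean signal (an assumption): 20 features driven by 2 hidden factors.
clean = rng.normal(size=(300, 2)) @ rng.normal(size=(2, 20))
noisy = clean + 0.5 * rng.normal(size=clean.shape)

denoised, retained = pca_denoise(noisy, k=2)

err_noisy = np.linalg.norm(noisy - clean)        # error before denoising
err_denoised = np.linalg.norm(denoised - clean)  # error after denoising
print(retained, err_noisy, err_denoised)
```

Here the reconstruction is closer to the clean signal than the noisy input is, because most of the noise lives in the 18 discarded directions. The `retained` value plays the role of "most of the information" in the paragraph above: it is the share of variance kept by the chosen components.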
Imagine that you have a dataset with 230 different variables for which you need to create a classification model. What can PCA be used for in this case?