Question

After performing a PCA and studying the procedure, I ask myself what the result is good for in the next step. From the PCA I learned how to visualize the dataset by lowering the dimension, I got handy new vectors to describe the members of the population in a more efficient way, and I learned which original predictors correlate and which contribute more than others.

I ask myself questions like: is there more to learn from PCA? Is it a good idea to feed the PCA output into a learning algorithm so it performs better? And is it possible to account for nominal or ordinal predictors in the PCA somehow?


Solution

Well, PCA, as suggested above by @CarltonBanks, helps you drop the directions that carry the least information and combine correlated features into new components that capture most of the variance.

To answer your question on how to visualize higher-dimensional data using PCA:

  1. Transform the feature matrix, setting the number of components to 2 or 3.
  2. This ensures you can represent your dataset in 2 or 3 dimensions. To see the result, simply plot the transformed matrix in a 2D or 3D plot respectively (see the sketch after this list).
  3. This lets you visualize higher-dimensional data as a 2D or 3D entity, so while using regression or some other predictive modeling technique you can assess the trend of the data.
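A minimal sketch of those three steps, assuming scikit-learn and matplotlib are available; the Iris dataset is just a stand-in for your own feature matrix:

```python
# Reduce a 4-dimensional dataset to 2 principal components and plot it.
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)             # 4-dimensional feature matrix
X_scaled = StandardScaler().fit_transform(X)  # PCA is sensitive to feature scale

pca = PCA(n_components=2)                     # keep only 2 components
X_2d = pca.fit_transform(X_scaled)            # shape (n_samples, 2)

print("explained variance ratio:", pca.explained_variance_ratio_)

plt.scatter(X_2d[:, 0], X_2d[:, 1], c=y)
plt.xlabel("PC 1")
plt.ylabel("PC 2")
plt.show()
```

The explained variance ratio tells you how much of the original variation survives in the 2D picture, which is worth checking before trusting any trend you see in the plot.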

Should we use PCA in machine learning algorithms more often?

Well, that strictly depends. Using PCA throws away some information, so it can reduce the accuracy of a model trained on the transformed data. If you need to save space because you have many weakly informative features and a small loss of accuracy is acceptable, then it is fine to proceed.
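One way to judge whether that trade-off matters in your case is to cross-validate the same model with and without a PCA step. A rough sketch, assuming scikit-learn; the digits dataset and the 95% variance threshold are arbitrary placeholders:

```python
# Compare a classifier with and without a PCA step in the pipeline.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_digits(return_X_y=True)

plain = make_pipeline(StandardScaler(),
                      LogisticRegression(max_iter=5000))
with_pca = make_pipeline(StandardScaler(),
                         PCA(n_components=0.95),  # keep 95% of the variance
                         LogisticRegression(max_iter=5000))

print("no PCA  :", cross_val_score(plain, X, y, cv=5).mean())
print("with PCA:", cross_val_score(with_pca, X, y, cv=5).mean())
```

If the scores are close, the dimensionality reduction costs you little; if they diverge, you are paying for the space savings with accuracy.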

However, the most common use of PCA is, as you asked before, visualizing higher-dimensional data to see its overall trend and to judge which model might fit best.

Licensed under: CC-BY-SA with attribution