Why is Regularization after PCA or Factor Analysis a bad idea?

https://datascience.stackexchange.com/questions/80164

machine-learning
pca
regularization
ridge-regression
exploratory-factor-analysis

13-12-2020

Question

I ran Factor Analysis on my data and then applied various machine learning models to the result. Ridge and Lasso regression in particular give a high MSE compared to the other models. I want to know why this happens.

Solution

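In principle, PCA is unsupervised and therefore label-agnostic. That means the low-dimensional projections onto the PCs may well be unrelated to what the model is trying to predict. The amount of variance your PCs capture tells you how much of the input information is retained, although a high explained variance does not by itself guarantee that the retained directions are relevant to the target.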
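To make the label-agnostic point concrete, here is a minimal sketch on made-up data, not the asker's data. It assumes a hypothetical two-feature dataset in which the high-variance feature is unrelated to the target and the low-variance feature drives it. The helpers `pca_project` and `ridge_mse` are illustrative names written in plain NumPy, and the MSE is computed in-sample for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical data: x_big has large variance but no link to y,
# x_small has small variance and determines y almost entirely.
x_big = rng.normal(scale=10.0, size=n)
x_small = rng.normal(scale=1.0, size=n)
X = np.column_stack([x_big, x_small])
y = 3.0 * x_small + rng.normal(scale=0.1, size=n)


def pca_project(X, k):
    """Project centered X onto its first k principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)  # explained variance ratio per PC
    return Xc @ Vt[:k].T, explained[:k]


def ridge_mse(Z, y, alpha=1.0):
    """Closed-form ridge fit on centered data; returns in-sample MSE."""
    Zc = Z - Z.mean(axis=0)
    yc = y - y.mean()
    w = np.linalg.solve(Zc.T @ Zc + alpha * np.eye(Zc.shape[1]), Zc.T @ yc)
    return float(np.mean((yc - Zc @ w) ** 2))


Z, explained = pca_project(X, k=1)
mse_raw = ridge_mse(X, y)  # ridge on the original features
mse_pca = ridge_mse(Z, y)  # ridge on the first PC only

print("variance explained by PC1:", explained[0])
print("ridge MSE on raw features:", mse_raw)
print("ridge MSE on PC1 only:", mse_pca)
```

In this constructed case the first PC keeps almost all of the variance, yet the ridge fit on it is far worse than on the raw features, because the direction that was dropped is the one that carries the signal.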

In essence, PCA should never be used as a means of regularisation but rather for dimensionality reduction. Alternatively, you could try a VIF (variance inflation factor) approach, although I am not sure what your exact goal is.
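As a rough illustration of what a VIF check could look like, the sketch below assumes "VIF" means the variance inflation factor, VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing feature j on all the other features. The function `vif` and the synthetic columns `a`, `b`, `c` are hypothetical and written with NumPy only.

```python
import numpy as np


def vif(X):
    """Variance inflation factor for each column of X."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        target = X[:, j]
        # Regress column j on an intercept plus all remaining columns.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out


rng = np.random.default_rng(0)
a = rng.normal(size=300)
b = a + rng.normal(scale=0.1, size=300)  # nearly collinear with a
c = rng.normal(size=300)                 # independent of a and b

vifs = vif(np.column_stack([a, b, c]))
print(vifs)
```

Features with a large VIF (a common rule of thumb is above 5 or 10) are candidates for removal, which reduces multicollinearity while keeping the original, interpretable features instead of projected components.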

Licensed under: CC-BY-SA with attribution

Not affiliated with datascience.stackexchange