Why is Regularization after PCA or Factor Analysis a bad idea?

https://datascience.stackexchange.com/questions/80164

machine-learning
pca
regularization
ridge-regression
exploratory-factor-analysis

13-12-2020
Question

I ran Factor Analysis on my data and then fitted various machine learning models to the result. Ridge and Lasso regression in particular give a high MSE compared to the other models. Why does this happen?

Solution

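In principle, PCA is unsupervised and therefore label-agnostic (the same holds for factor analysis). That means the projections onto the principal components (PCs) may well be unrelated to what the model is trying to predict. The amount of variance your PCs capture tells you how much of the spread in the features is retained, but it says nothing directly about how relevant that retained information is to the target.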
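To make the label-agnostic point concrete, here is a minimal sketch on synthetic data, not the asker's data. It assumes scikit-learn is available; the feature scales, the coefficient 3.0, and `alpha=1.0` are arbitrary choices for illustration. The target depends only on the low-variance feature, so a single PC captures almost all of the feature variance and still discards nearly everything useful for prediction.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
n = 500

# Synthetic features: x1 has a large variance, x2 a small one.
x1 = rng.normal(scale=10.0, size=n)
x2 = rng.normal(scale=1.0, size=n)
X = np.column_stack([x1, x2])

# The target depends only on the LOW-variance feature x2.
y = 3.0 * x2 + rng.normal(scale=0.1, size=n)

# Keep one principal component; PCA never looks at y.
pca = PCA(n_components=1)
Z = pca.fit_transform(X)
explained = pca.explained_variance_ratio_[0]

# In-sample MSE of Ridge on the raw features vs. on the single PC.
mse_raw = mean_squared_error(y, Ridge(alpha=1.0).fit(X, y).predict(X))
mse_pca = mean_squared_error(y, Ridge(alpha=1.0).fit(Z, y).predict(Z))

print(f"variance explained by PC1: {explained:.3f}")
print(f"Ridge MSE on raw features: {mse_raw:.3f}")
print(f"Ridge MSE on PC1 only:     {mse_pca:.3f}")
```

In this constructed case the first PC is essentially x1, so the regression on it has almost nothing to work with. Real data is rarely this extreme, but the same mechanism can inflate the MSE of any model fitted after an unsupervised projection.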

In essence, PCA should not be used as a means of regularisation; it is a tool for dimensionality reduction. If multicollinearity is the concern, you could instead try an approach based on the variance inflation factor (VIF), although I am not sure what your exact goal is.
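As a rough sketch of what a VIF check could look like, the snippet below computes the VIF of each feature as 1 / (1 - R²), where R² comes from regressing that feature on all the others. The helper name `vif` and the synthetic columns are hypothetical and only for illustration; it uses plain NumPy rather than any particular library's VIF routine.

```python
import numpy as np

def vif(X):
    """Return the variance inflation factor of each column of X."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress column j on the remaining columns plus an intercept.
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        r2 = 1.0 - resid.var() / target.var()
        out[j] = 1.0 / (1.0 - r2)
    return out

# Synthetic example: column c is nearly a copy of column a.
rng = np.random.default_rng(0)
a = rng.normal(size=300)
b = rng.normal(size=300)
c = a + 0.1 * rng.normal(size=300)

vifs = vif(np.column_stack([a, b, c]))
print(dict(zip(["a", "b", "c"], np.round(vifs, 1))))
```

Features with a large VIF (a common rule of thumb is above 5 or 10) are strongly explained by the other features and are candidates for removal. Unlike PCA, this keeps the remaining features in their original, interpretable form.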

Licensed under: CC-BY-SA with attribution

Not affiliated with datascience.stackexchange