Question

I ran Factor Analysis on my data and then applied various machine learning models to the result. Ridge and Lasso regression in particular give a high MSE compared to the other models. I would like to know why this happens.

Answer

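In principle, PCA is unsupervised and therefore label-agnostic. That means the directions kept by the projection onto the principal components (PCs) may well be unrelated to what the model is trying to predict. You may be able to gauge how much information was discarded from the amount of variance your PCs capture, though keep in mind that explained variance describes the features only, not the target.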
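To make the point concrete, here is a minimal sketch on synthetic data (the data, the closed-form `ridge_fit_predict` helper and the choice of `alpha=1.0` are all assumptions made for illustration, not taken from the question). The target depends only on a low-variance feature, so keeping the first PC retains almost all of the feature variance while discarding almost all of the signal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed for illustration): x_big has large variance but is
# unrelated to y; x_small has small variance and determines y up to noise.
n = 500
x_big = rng.normal(scale=10.0, size=n)
x_small = rng.normal(scale=1.0, size=n)
X = np.column_stack([x_big, x_small])
y = 3.0 * x_small + rng.normal(scale=0.1, size=n)


def ridge_fit_predict(A, y, alpha=1.0):
    """Hypothetical helper: closed-form ridge on centred data, in-sample predictions."""
    A_c = A - A.mean(axis=0)
    y_mean = y.mean()
    gram = A_c.T @ A_c + alpha * np.eye(A_c.shape[1])
    w = np.linalg.solve(gram, A_c.T @ (y - y_mean))
    return A_c @ w + y_mean


# PCA via SVD of the centred feature matrix; keep only the first component.
X_c = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_c, full_matrices=False)
explained_ratio = S**2 / np.sum(S**2)
Z = X_c @ Vt[:1].T  # scores on PC1

mse_full = np.mean((y - ridge_fit_predict(X, y)) ** 2)
mse_pc1 = np.mean((y - ridge_fit_predict(Z, y)) ** 2)

print(f"variance explained by PC1:      {explained_ratio[0]:.3f}")
print(f"ridge MSE on original features: {mse_full:.3f}")
print(f"ridge MSE on PC1 only:          {mse_pc1:.3f}")
```

In this setup PC1 is dominated by `x_big`, so the ridge model fitted on PC1 alone has little to work with even though the explained-variance ratio looks excellent. Real data will rarely be this extreme, but the same mechanism can inflate the MSE of linear models after an unsupervised projection.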

In essence, PCA should not be used as a means of regularisation but rather for dimensionality reduction. You could alternatively try a VIF (variance inflation factor) approach to deal with collinear features, although I am not sure what your exact goal is.
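As a rough sketch of what a VIF check could look like (the `vif` helper and the synthetic features are assumptions for illustration), the VIF of feature j is 1 / (1 - R_j^2), where R_j^2 comes from regressing feature j on the remaining features. With an intercept in that regression, this equals the j-th diagonal entry of the inverse of the feature correlation matrix, which is what the code below computes.

```python
import numpy as np


def vif(X):
    """Hypothetical helper: VIF per column, via the inverse correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))


rng = np.random.default_rng(0)
n = 1000

# Synthetic features (assumed): x2 is nearly a copy of x1, x3 is independent.
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

vifs = vif(X)
for name, value in zip(["x1", "x2", "x3"], vifs):
    print(f"VIF({name}) = {value:.1f}")
```

A common rule of thumb treats VIF values above roughly 5 to 10 as a sign of problematic collinearity; features flagged this way can be dropped or combined while the remaining ones stay interpretable, which is not the case after a PCA projection.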

Licensed under: CC-BY-SA with attribution