Question

I ran Factor Analysis on my data and then fit various machine learning models on the result. Ridge and Lasso regression in particular give a high MSE compared to the other models. Why does this happen?

Solution

In principle, PCA is unsupervised and therefore label-agnostic. That means the directions kept by the down-projection onto the principal components (PCs) may well be unrelated to what the model is trying to predict. You may be able to get a sense of how much information the projection discards from the amount of variance your PCs capture.
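
As a rough illustration of that point, here is a minimal sketch on synthetic data. The data, the coefficients, and the choice to compute PCA by hand with an SVD in NumPy are all assumptions made for the example, not something from the question. The target is deliberately built from the lowest-variance feature, so the component that explains almost no variance is the one that carries the signal.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic features (assumed for illustration): column 0 has a large
# variance, column 2 a very small one.
X = rng.normal(size=(n, 3)) * np.array([10.0, 3.0, 0.5])

# Assumption: the target depends only on the low-variance column.
y = 4.0 * X[:, 2] + rng.normal(scale=0.1, size=n)

# PCA via SVD of the centered data.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained_variance_ratio = S**2 / np.sum(S**2)
scores = Xc @ Vt.T  # principal component scores, one column per PC

# Absolute correlation between each PC and the target.
pc_target_corr = np.array(
    [abs(np.corrcoef(scores[:, k], y)[0, 1]) for k in range(scores.shape[1])]
)

print("explained variance ratio:", np.round(explained_variance_ratio, 3))
print("|corr(PC, y)|:           ", np.round(pc_target_corr, 3))
```

Keeping only the leading components here would throw away nearly all of the predictive signal, which any downstream regression, Ridge and Lasso included, cannot recover.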

In essence, PCA should not be used as a means of regularisation, but rather for dimensionality reduction. You could alternatively try a variance inflation factor (VIF) approach, although I am not sure what your exact goal is.
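
For reference, a minimal sketch of what a VIF check could look like. The helper name `vif` and the synthetic columns are hypothetical, and the computation is written out in plain NumPy from the usual definition VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing feature j on the remaining features.

```python
import numpy as np

def vif(X):
    """Return the variance inflation factor of each column of X (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        target = X[:, j]
        # Regress column j on all other columns plus an intercept.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - resid.var() / target.var()
        out[j] = 1.0 / (1.0 - r2)
    return out

# Synthetic example (assumed): c is nearly a copy of a, b is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=300)
b = rng.normal(size=300)
c = a + 0.1 * rng.normal(size=300)

vifs = vif(np.column_stack([a, b, c]))
print(np.round(vifs, 2))
```

Columns with a large VIF (a common rule of thumb is above 5 or 10) are strongly explained by the other features and are candidates for removal. Unlike PCA, this keeps the original, interpretable features.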

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange