Question

I am building a regression model and was wondering what the consequences are of having two or more highly correlated columns in the dataset. Can that decrease the accuracy of the model? Answering this question would help me decide how to deal with it. Would PCA be the best option here?


Solution

Highly correlated features are a form of redundancy in the feature set. And yes, they affect a regression model: when features are highly correlated (multicollinearity), the coefficient estimates become unstable and hard to interpret. A very nice explanation is given here.
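One common way to check how redundant each feature is (not mentioned in the original answer, added here as an illustration) is the variance inflation factor, VIF = 1 / (1 - R^2), where R^2 comes from regressing that feature on all the others. The sketch below is a minimal NumPy version on synthetic data; the column names `x1`, `x2`, `x3` and the helper `vif` are made up for the example.

```python
import numpy as np

# Synthetic data (an assumption for illustration): x2 is almost a copy of x1,
# x3 is independent of both.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])


def vif(X):
    """Return the variance inflation factor of each column of X.

    Each column is regressed (with an intercept) on the remaining columns,
    and VIF = 1 / (1 - R^2) of that regression.
    """
    factors = []
    for j in range(X.shape[1]):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(len(y)), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - resid.var() / y.var()
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)


vifs = vif(X)
print(vifs)  # very large for x1 and x2, close to 1 for x3
```

A frequently quoted rule of thumb treats a VIF above roughly 5 to 10 as a sign of problematic collinearity, but the right cutoff depends on the problem.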

PCA is a reasonable choice for dimensionality reduction: it replaces the correlated columns with a smaller set of uncorrelated components.
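To see what PCA does to correlated columns, here is a minimal sketch that computes the principal components with NumPy's SVD on synthetic data. The data and variable names are assumptions for the example; in practice a library implementation such as scikit-learn's `PCA` would normally be used instead.

```python
import numpy as np

# Synthetic data (an assumption for illustration): two strongly correlated columns.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)
X = np.column_stack([x1, x2])

# PCA via SVD of the centered data matrix.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

scores = Xc @ Vt.T                   # data expressed in principal-component axes
explained = S**2 / np.sum(S**2)      # fraction of variance per component

corr_before = np.corrcoef(X, rowvar=False)[0, 1]
corr_after = np.corrcoef(scores, rowvar=False)[0, 1]

print("correlation between original columns:", corr_before)
print("correlation between components:", corr_after)
print("explained variance ratio:", explained)
```

The components are uncorrelated by construction, and here nearly all of the variance falls on the first one, so the second could be dropped. Keep in mind that the components are linear mixes of the original columns, so the regression coefficients no longer map directly onto individual features.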

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange