What happens when you have highly correlated columns in a dataset
22-10-2019
Question
I am building a regression model, and I was wondering what the consequences are of having two or more highly correlated columns in the dataset. Is that something that can decrease the accuracy of the model? Answering this question would help me decide how to deal with it. Would PCA be the best option here?
Solution
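Highly correlated features are a form of redundancy in the feature set. And yes, they affect a regression model: when two features carry nearly the same information, the model cannot reliably separate their individual effects, so the fitted coefficients become unstable and hard to interpret. A very nice explanation is given here.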
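To make the effect concrete, here is a minimal sketch using synthetic data and plain NumPy least squares. Everything in it is an assumption for illustration (the function name `coef_spread`, the true coefficients of 1, the noise level, the sample sizes); it is not taken from the linked explanation. It refits the same model on many freshly drawn datasets and measures how much the coefficient of the first feature moves around, once with uncorrelated features and once with a correlation of 0.99.

```python
import numpy as np

def coef_spread(rho, n_samples=200, n_trials=300, seed=0):
    """Std. dev. of the fitted coefficient of x1 across freshly drawn datasets.

    rho is the correlation between the two features (illustrative setup).
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    coefs = []
    for _ in range(n_trials):
        # Two features with the requested correlation.
        X = rng.multivariate_normal([0.0, 0.0], cov, size=n_samples)
        # Assumed true model: y = x1 + x2 + noise.
        y = X[:, 0] + X[:, 1] + rng.normal(scale=1.0, size=n_samples)
        # Ordinary least squares with an intercept column.
        A = np.column_stack([np.ones(n_samples), X])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        coefs.append(beta[1])  # coefficient of x1
    return float(np.std(coefs))

spread_uncorrelated = coef_spread(rho=0.0)
spread_correlated = coef_spread(rho=0.99)
print("coefficient spread, rho=0.00:", spread_uncorrelated)
print("coefficient spread, rho=0.99:", spread_correlated)
```

In this setup the spread should be several times larger in the correlated case. Note that this sketch only measures coefficient stability; it says nothing by itself about predictive accuracy, which is often much less sensitive to correlated inputs.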
PCA is a reasonable choice for dimensionality reduction here, since the principal components it produces are uncorrelated with each other.
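As a rough sketch of what PCA does with redundant columns, the example below builds three synthetic features, one of which is almost a copy of another, and computes the principal components with NumPy's SVD. The data, the variable names, and the choice to keep two components are assumptions for illustration; in practice a library implementation such as scikit-learn's `PCA` would typically be used, and features on different scales would usually be standardized first.

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = x1 + rng.normal(scale=0.1, size=500)  # nearly a copy of x1
x3 = rng.normal(size=500)                  # independent feature
X = np.column_stack([x1, x2, x3])

# PCA via SVD of the centered data matrix.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Fraction of total variance captured by each component.
explained = S**2 / np.sum(S**2)

# Project onto the first two principal components.
Z = Xc @ Vt[:2].T

# Correlation between the retained components (should be ~0).
comp_corr = np.corrcoef(Z, rowvar=False)

print("explained variance ratio:", explained)
print("correlation between components:", comp_corr[0, 1])
```

The last component should carry almost no variance, which reflects the redundancy between the first two columns, and the retained components are uncorrelated by construction. The trade-off is that each component is a mix of the original columns, so the regression coefficients no longer refer to individual features.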
Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange