Is this good feature engineering practice?

https://datascience.stackexchange.com/questions/33117

31-10-2019

Question

I have a practical question about feature engineering. Say I want to predict house prices with a regression model (for example, linear regression; logistic regression would only fit if the target were a category such as a price band), using a bunch of features including zip code. After checking the feature importance, I realize zip code is a pretty good feature, so I decide to add more features based on it. For example, I go to the Census Bureau and get the average income, population, number of schools, and number of hospitals for each zip code. With these four new features, the model performs better. So I add even more zip-related features, and the cycle goes on and on. Eventually the model will be dominated by these zip-related features, right?
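To make the cycle concrete, here is a minimal sketch of one round of it: join zip-level statistics onto the house records, then check whether the cross-validated score actually moves. Everything in it is an assumption for illustration: the data are synthetic, the column names (`avg_income`, `population`, `noise_stat`, `sqft`) are made up, and it uses ordinary linear regression as a stand-in model because the target is a continuous price.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-in data (hypothetical): 50 zip codes, 2000 houses.
n_zips, n_houses = 50, 2000
zip_stats = pd.DataFrame({
    "zip": np.arange(n_zips),
    "avg_income": rng.normal(60_000, 15_000, n_zips),
    "population": rng.integers(5_000, 80_000, n_zips),
    "noise_stat": rng.normal(0, 1, n_zips),  # zip-level feature unrelated to price
})
houses = pd.DataFrame({
    "zip": rng.integers(0, n_zips, n_houses),
    "sqft": rng.normal(1_800, 400, n_houses),
})

# Attach the zip-level statistics to every house in that zip code.
data = houses.merge(zip_stats, on="zip", how="left")

# By construction, price depends only on sqft and avg_income.
data["price"] = (
    100 * data["sqft"]
    + 3 * data["avg_income"]
    + rng.normal(0, 20_000, n_houses)
)

def cv_r2(columns):
    """Mean 5-fold cross-validated R^2 of a linear model on the given columns."""
    scores = cross_val_score(
        LinearRegression(), data[columns], data["price"], cv=5, scoring="r2"
    )
    return scores.mean()

baseline = cv_r2(["sqft"])
with_income = cv_r2(["sqft", "avg_income"])
with_noise = cv_r2(["sqft", "avg_income", "noise_stat"])

print(f"sqft only:             {baseline:.3f}")
print(f"+ avg_income:          {with_income:.3f}")
print(f"+ avg_income + noise:  {with_noise:.3f}")
```

In this toy setup the informative zip-level feature raises the held-out score noticeably, while the unrelated one leaves it essentially unchanged. Comparing held-out scores before and after each addition is one possible way to judge whether another round of zip-related features is worth it.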

My questions:

Does it make sense to do this in the first place?
If yes, how do I know when to stop this cycle?
If not, why not?

No correct solution

Licensed under: CC-BY-SA with attribution

Not affiliated with datascience.stackexchange