Question

People generally avoid using dropout at the input layer itself...

But my main question is: isn't it actually better to do so?

My reasoning: adding dropout at the input (since it is randomized, it effectively acts as another regularizer) should make the model more robust. It becomes less dependent on any fixed subset of features that "always matter" and is forced to find other patterns too, so the model should generalize better, even though we may temporarily lose some important features; which features are dropped is decided at random on each forward pass.
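For concreteness, here is a minimal sketch of what I mean in PyTorch (the layer sizes and dropout rates are just placeholders, not recommendations):

```python
import torch
import torch.nn as nn

# A small MLP with dropout applied directly to the input features.
# Input-layer dropout rates are usually kept lower (e.g. 0.1-0.2)
# than the 0.5 often used on hidden layers.
model = nn.Sequential(
    nn.Dropout(p=0.2),   # input-layer dropout: randomly zeroes input features
    nn.Linear(20, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),   # the more common hidden-layer dropout
    nn.Linear(64, 1),
)

x = torch.randn(8, 20)   # batch of 8 samples, 20 features
model.train()            # dropout active: a fresh random mask per forward pass
y = model(x)
model.eval()             # dropout disabled at inference time
```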

What am I missing, or where is my interpretation incorrect?

Isn't this roughly equivalent to what we generally do with non-NN models: removing features one by one and rebuilding the model each time to gauge each feature's importance?
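By that I mean something like this leave-one-feature-out sketch (scikit-learn, a random forest, and the synthetic dataset are purely illustrative choices):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Retrain without each feature in turn and compare against a baseline score;
# a large score drop suggests the dropped feature was important.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
baseline = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=3).mean()

for j in range(X.shape[1]):
    X_drop = np.delete(X, j, axis=1)  # remove feature j
    score = cross_val_score(RandomForestClassifier(random_state=0), X_drop, y, cv=3).mean()
    print(f"feature {j}: score drop = {baseline - score:.4f}")
```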

No correct solution

Licensed under: CC-BY-SA with attribution