Why is it bad to use the same test dataset over and over again?
31-10-2019
Question
I am following Google's Machine Learning Crash Course series.
In the chapter on generalisation, they make the following statement:
Good performance on the test set is a useful indicator of good performance on the new data in general, assuming that:
- The test set is large enough.
- You don't cheat by using the same test set over and over.
Why exactly is the second point a problem? As long as the test set is never used during training, why is it bad to keep using the same test set to measure a model's performance? It's not as if the model becomes biased by doing so: the test set does not update any of the model's parameters.
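One way to see the concern in the quoted statement is that the leak happens through the practitioner's decisions rather than through the model's parameters: if you repeatedly evaluate candidate models on one fixed test set and keep whichever scores best, the reported score inflates even when every candidate is pure noise. A minimal simulation sketching this (all sizes and counts here are illustrative choices, not from the course):

```python
import numpy as np

rng = np.random.default_rng(0)
n_test = 100
y_test = rng.integers(0, 2, n_test)  # labels of one fixed, reused test set

# Simulate 1000 "model variants" that are all pure noise:
# each predicts labels at random, so every variant's true accuracy is 50%.
best_acc, best_preds = 0.0, None
for _ in range(1000):
    preds = rng.integers(0, 2, n_test)
    acc = (preds == y_test).mean()
    if acc > best_acc:  # keep the variant that looks best on the reused set
        best_acc, best_preds = acc, preds

# The "winning" variant looks well above chance on the reused test set...
print(f"accuracy on reused test set: {best_acc:.2f}")

# ...but on a fresh test set it falls back to roughly 50%.
y_fresh = rng.integers(0, 2, n_test)
print(f"accuracy on fresh test set:  {(best_preds == y_fresh).mean():.2f}")
```

No individual evaluation updates any parameters, yet the act of selecting based on repeated test-set scores fits the selection process to that particular test set, which is why a fresh (or held-out) test set gives an honest estimate.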
No correct solution
Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange