Including the validation file in the training process after tuning

https://datascience.stackexchange.com/questions/66183

machine-learning
training
hyperparameter-tuning

20-10-2020

Question

Should I include the validation file in the training process after finishing the tuning process (e.g. searching for params using the validation file)?

Solution

It depends on how the training, validation, and holdout/test sets are distributed relative to one another.

There are a couple of possibilities (basically permutations of which sets differ from which). In general, any difference in distribution between the sets (covariate shift) is bad, and you should correct it. If that is your situation, including the validation set is the least of your problems: you should include it in this case, since it helps you make the corrections, but the covariate shift itself is what you should worry about.
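One rough way to look for covariate shift in a single numeric feature is to compare its empirical distribution in two of the sets. The sketch below uses the two-sample Kolmogorov-Smirnov statistic, which the answer above does not name; it is only one possible check, it looks at one feature at a time, and the data and all names (`ks_statistic`, `train_feature`, and so on) are made up for illustration.

```python
import bisect
import random


def ks_statistic(a, b):
    """Largest absolute gap between the empirical CDFs of samples a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    # Both empirical CDFs are step functions, so the largest gap
    # occurs at one of the observed values.
    for v in a_sorted + b_sorted:
        cdf_a = bisect.bisect_right(a_sorted, v) / len(a_sorted)
        cdf_b = bisect.bisect_right(b_sorted, v) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical feature values: validation drawn like train, test shifted.
train_feature = [rng.gauss(0.0, 1.0) for _ in range(500)]
valid_feature = [rng.gauss(0.0, 1.0) for _ in range(500)]
test_feature = [rng.gauss(2.0, 1.0) for _ in range(500)]

d_same = ks_statistic(train_feature, valid_feature)
d_shifted = ks_statistic(train_feature, test_feature)

print(f"train vs valid: {d_same:.3f}")
print(f"train vs test:  {d_shifted:.3f}")
```

A statistic near 0 suggests the two samples look alike for that feature, while a value near 1 suggests they barely overlap. What counts as "too large" depends on sample size and on the problem, so treat any threshold as a judgment call.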

If the distributions are the same across the sets, adding the validation set (the one used for hyperparameter tuning) to the training data won't make any negative difference, and it can only help.
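As a minimal sketch of that workflow, assuming all three sets come from the same distribution: tune on the validation set, refit on train plus validation with the chosen hyperparameter, then score once on the test set. The model here is a toy one-dimensional ridge regression without an intercept, and the data, the candidate values, and every name are illustrative assumptions, not part of the original answer.

```python
import random


def fit_ridge_1d(xs, ys, lam):
    """Closed-form 1-D ridge without intercept: w = sum(x*y) / (sum(x*x) + lam)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def mse(w, xs, ys):
    """Mean squared error of the prediction w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(0)  # fixed seed so the sketch is reproducible


def make_data(n):
    """Synthetic data: y = 2x + noise, same distribution for every split."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 0.5) for x in xs]
    return xs, ys


x_train, y_train = make_data(60)
x_valid, y_valid = make_data(20)
x_test, y_test = make_data(20)

# Step 1: pick the hyperparameter using train for fitting, valid for scoring.
candidate_lams = [0.0, 0.1, 1.0, 10.0]
best_lam = min(
    candidate_lams,
    key=lambda lam: mse(fit_ridge_1d(x_train, y_train, lam), x_valid, y_valid),
)

# Step 2: refit on train + valid with the chosen hyperparameter.
x_full = x_train + x_valid
y_full = y_train + y_valid
final_w = fit_ridge_1d(x_full, y_full, best_lam)

# Step 3: evaluate once on the untouched test set.
test_mse = mse(final_w, x_test, y_test)
print(f"best lam: {best_lam}, final w: {final_w:.3f}, test MSE: {test_mse:.3f}")
```

The test set plays no part in steps 1 and 2, so its score remains an honest estimate for the refitted model.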

Licensed under: CC-BY-SA with attribution

Not affiliated with datascience.stackexchange