Question

Should I include the validation file in the training process after finishing the tuning process (e.g. searching for params using the validation file)?

Solution

It depends on how the train, valid and holdout/test sets are distributed relative to one another.

There are several possible cases (basically every combination of which sets match and which do not). In general, any difference in distribution between the sets (covariate shift) is bad and you should repair it. If that is your situation, whether to include valid is the least of your problems: you should include it so it can be used when making the correction, but the covariate shift itself is what you should worry about.
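One rough way to check whether a single numeric feature is distributed the same way in two sets is the two-sample Kolmogorov-Smirnov statistic, which is the largest gap between the two empirical CDFs. This check is not part of the answer above; it is a minimal sketch on synthetic data, and the names `ks_statistic`, `train`, `valid` and `test_shifted` are hypothetical. Values near 0 suggest similar distributions, while larger values suggest a shift.

```python
import random


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step past every value equal to x in both samples (handles ties).
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


# Synthetic data, for illustration only (assumption: one numeric feature).
rng = random.Random(0)
train = [rng.gauss(0.0, 1.0) for _ in range(2000)]
valid = [rng.gauss(0.0, 1.0) for _ in range(2000)]         # same distribution
test_shifted = [rng.gauss(1.0, 1.0) for _ in range(2000)]  # mean shifted by 1

ks_same = ks_statistic(train, valid)
ks_shifted = ks_statistic(train, test_shifted)
print(f"train vs valid: {ks_same:.3f}")
print(f"train vs shifted test: {ks_shifted:.3f}")
```

In practice you would run a check like this per feature, or use a library routine such as `scipy.stats.ks_2samp`, which also reports a p-value.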

If the distributions are the same across the sets, adding the valid (hyperparameter-tuning) dataset to train won't make any negative difference and can only help.
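The workflow this describes can be sketched as follows: pick the hyperparameter on the validation set, then refit with that value on train plus valid, and touch the holdout/test set only once at the end. This is a minimal illustration on synthetic data, assuming all three sets come from the same distribution. The one-dimensional ridge model and the names `fit_ridge_slope`, `make_split` and `lam_grid` are hypothetical stand-ins for whatever model and search space you actually use.

```python
import random


def fit_ridge_slope(xs, ys, lam):
    """Closed-form 1-D ridge regression without intercept."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def mse(w, xs, ys):
    """Mean squared error of the prediction w * x."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def make_split(rng, n, true_slope=2.0):
    """Synthetic data: y = true_slope * x + unit Gaussian noise."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [true_slope * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


rng = random.Random(0)
x_train, y_train = make_split(rng, 200)
x_valid, y_valid = make_split(rng, 100)
x_test, y_test = make_split(rng, 100)

# 1) Tune: fit on train only, score each candidate on valid.
lam_grid = [0.0, 0.1, 1.0, 10.0, 100.0]
best_lam = min(
    lam_grid,
    key=lambda lam: mse(fit_ridge_slope(x_train, y_train, lam), x_valid, y_valid),
)

# 2) Refit on train + valid with the chosen hyperparameter.
final_w = fit_ridge_slope(x_train + x_valid, y_train + y_valid, best_lam)

# 3) Evaluate once on the untouched holdout/test set.
test_mse = mse(final_w, x_test, y_test)
print(f"best lambda: {best_lam}, slope: {final_w:.3f}, test MSE: {test_mse:.3f}")
```

Note that after the refit there is no validation score left for the final model, so the holdout/test set is the only unbiased estimate of its performance.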

Licensed under: CC-BY-SA with attribution