Question

I have been taught that for each epoch in training, we perform a training phase, and then a validation phase where we decide whether the new set of parameters is better than the current best. This selection of parameters is done using the same loss function that we use for training, but applied to validation data instead of training data.

My question is, does the metric used in the validation phase really have to be the same loss function, or can it be some other metric, such as accuracy or average precision?


Solution

It doesn't have to be the same function and usually is not.

The point of the validation set is to measure how well the model is actually doing. That is only useful if we measure it with a metric that has some value to us. For instance, in classification, usually no one cares which model achieves the lowest cross-entropy; what matters is which one achieves the highest accuracy, which makes more sense from a business standpoint.
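
To make this concrete, here is a minimal sketch in plain Python. The labels and predicted probabilities are made-up numbers for two hypothetical models, chosen only to show that ranking by validation cross-entropy and ranking by validation accuracy can disagree.

```python
import math

def cross_entropy(labels, probs):
    """Mean binary cross-entropy; probs are predicted P(label == 1)."""
    total = 0.0
    for y, p in zip(labels, probs):
        total += -math.log(p if y == 1 else 1.0 - p)
    return total / len(labels)

def accuracy(labels, probs, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the label."""
    correct = sum(int(p >= threshold) == y for y, p in zip(labels, probs))
    return correct / len(labels)

# Hypothetical validation labels and predictions (illustrative values only).
val_labels = [1, 1, 1, 1, 0, 0, 0, 0]
# Model A: very confident, right 7 times, badly wrong once.
model_a_probs = [0.99, 0.99, 0.99, 0.99, 0.01, 0.01, 0.01, 0.999]
# Model B: cautious, right 6 times, mildly wrong twice.
model_b_probs = [0.70, 0.70, 0.70, 0.40, 0.30, 0.30, 0.30, 0.60]

ce_a = cross_entropy(val_labels, model_a_probs)
ce_b = cross_entropy(val_labels, model_b_probs)
acc_a = accuracy(val_labels, model_a_probs)
acc_b = accuracy(val_labels, model_b_probs)

print(f"Model A: cross-entropy={ce_a:.3f}, accuracy={acc_a:.3f}")
print(f"Model B: cross-entropy={ce_b:.3f}, accuracy={acc_b:.3f}")
```

With these numbers, Model B has the lower cross-entropy while Model A has the higher accuracy, so the "best" model depends on which metric you decide you care about on the validation set.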


I'd also like to clarify one thing in your initial statement, although it is unrelated to the question itself:

"I have been taught that for each epoch in training, we perform a training phase, and then a validation phase where we decide whether the new set of parameters is better than the current best. This selection of parameters is done using the same loss function that we use for training, but applied to validation data instead of training data."

This is incorrect! An epoch is one full pass over the training dataset. At the end of each epoch, we usually run a pass over the validation set to measure how well the model is actually doing. However, no hyperparameter tuning is performed at this step, because the model hasn't been fully trained yet. A model typically takes multiple epochs to train; only after that do we change the hyperparameters and start again. The only change we might make to a hyperparameter at the end of an epoch is something like a scheduled learning rate reduction.
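
The loop described above might look roughly like the following sketch. It uses a toy one-variable linear model in plain Python; the data, the starting learning rate, and the names `decay_every` and `decay_factor` are all assumptions made for illustration, not part of any particular library. The model trains on squared error, validation reports mean absolute error, and the only hyperparameter touched between epochs is the learning rate, following a schedule fixed in advance.

```python
import random

random.seed(0)

def make_data(n):
    """Hypothetical toy data: y = 3x + 1 plus a little Gaussian noise."""
    xs = [random.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 3.0 * x + 1.0 + random.gauss(0.0, 0.05)) for x in xs]

def mean_abs_error(w, b, data):
    """Validation metric; deliberately different from the training loss."""
    return sum(abs(w * x + b - y) for x, y in data) / len(data)

train_set, val_set = make_data(200), make_data(50)

w, b = 0.0, 0.0                      # model parameters, learned by SGD
learning_rate = 0.1                  # hyperparameter, chosen before training
num_epochs = 6                       # hyperparameter, chosen before training
decay_every, decay_factor = 2, 0.5   # the schedule is also fixed up front

initial_val_mae = mean_abs_error(w, b, val_set)
val_history = []

for epoch in range(1, num_epochs + 1):
    # Training phase: one epoch = one full pass over the training set.
    random.shuffle(train_set)
    for x, y in train_set:
        error = w * x + b - y        # derivative of 0.5 * error**2 w.r.t. the prediction
        w -= learning_rate * error * x
        b -= learning_rate * error

    # Validation phase: measure only, nothing is tuned here.
    val_mae = mean_abs_error(w, b, val_set)
    val_history.append(val_mae)
    print(f"epoch {epoch}: lr={learning_rate:.4f}, validation MAE={val_mae:.4f}")

    # Scheduled learning rate reduction: the one per-epoch hyperparameter change.
    if epoch % decay_every == 0:
        learning_rate *= decay_factor
```

Trying a different starting learning rate or number of epochs would mean running this whole loop again from scratch and comparing the final validation scores of the runs.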

Licensed under: CC-BY-SA with attribution