Question

The standard setup when training a neural network seems to be to split the data into train and test sets, and keep running until the scores stop improving on the test set. Now, the problem: there is a certain amount of noise in the test scores, so the single best score may not correspond to the network state that is most likely to perform best on new data.

I've seen a few papers point to a specific epoch or iteration in the training as being "best by cross-validation" but I have no idea how that is determined (and the papers do not provide any details). The "best by cross-validation" point is not the one with the best test score.

How would one go about doing this type of cross validation? Would it be by doing k-fold on the test set? Okay, that gives k different test scores instead of one, but then what?


Solution

I couldn't say what the authors mean by "best by cross-validation", but I'll mention a simple and general procedure that's out there:

You are correct that relying on a single estimate of the generalization performance, obtained from one training set and one test set, is quite simplistic. Cross-validation helps us understand how this performance varies across datasets, instead of wondering whether we got lucky or unlucky with our particular train/test split.

Split the whole dataset into k folds (or partitions), and train/test the model k times, using a different fold as the test set each time. When you're done, you can compute the mean performance and, crucially, its variance, which lets you assess how much confidence to place in the generalization performance estimate.
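
As a concrete illustration, here is a minimal sketch of that procedure using scikit-learn's cross_val_score; the logistic-regression model and the iris dataset are placeholders I've chosen for the example, not anything from the original question:

```python
# A minimal sketch of k-fold cross-validation with scikit-learn;
# the model and dataset here are illustrative placeholders.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# Train/test the model k=5 times, holding out a different fold each time.
scores = cross_val_score(model, X, y, cv=5)

print("per-fold accuracy:", scores)
print(f"mean: {scores.mean():.3f}, std: {scores.std():.3f}")
```

The mean is your performance estimate; the standard deviation across folds is exactly the "noise" the question worries about, made explicit.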

OTHER TIPS

Best by cross-validation: the whole dataset is first divided into a training set and a test set. You cannot touch the test set for any kind of training. Keep it away!

You can use the training data freely to pick the best classification model. But how do you know whether the current setting is the best one? You run cross-validation using only the training data: divide the training set into k folds, and for each candidate setting train on k-1 folds and evaluate on the remaining one. The setting that gives the best cross-validation result, e.g. a particular stopping epoch, is what those papers call "best by cross-validation"; see the sketch below.
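
Here is one hedged sketch of how such epoch selection could look in practice. scikit-learn's SGDClassifier with partial_fit stands in for a neural network trained one epoch at a time, and the synthetic dataset, the epoch count, and k are all illustrative assumptions:

```python
# Sketch: pick the "best" stopping epoch by k-fold cross-validation on the
# training set only, then evaluate once on the untouched test set.
# SGDClassifier is a stand-in for the neural network in the question.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import KFold, train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

n_epochs, k = 20, 5
fold_scores = np.zeros((k, n_epochs))

for i, (tr, va) in enumerate(
    KFold(n_splits=k, shuffle=True, random_state=0).split(X_train)
):
    model = SGDClassifier(random_state=0)
    for epoch in range(n_epochs):
        # One pass over this fold's training portion per "epoch".
        model.partial_fit(X_train[tr], y_train[tr], classes=np.unique(y))
        fold_scores[i, epoch] = model.score(X_train[va], y_train[va])

# The epoch with the best mean validation score across folds is
# "best by cross-validation".
best_epoch = int(fold_scores.mean(axis=0).argmax())
print(f"best epoch by cross-validation: {best_epoch}")

# Only now is the held-out test set used, for a final unbiased estimate.
final = SGDClassifier(random_state=0)
for _ in range(best_epoch + 1):
    final.partial_fit(X_train, y_train, classes=np.unique(y))
print(f"test accuracy: {final.score(X_test, y_test):.3f}")
```

Averaging over folds smooths out the per-run noise, which is why the epoch chosen this way need not coincide with the single best test-set score the question describes.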

Licensed under: CC-BY-SA with attribution