Question

I created a model to solve a time series forecasting problem. I had a limited number of time series to train the model with, so I decided to augment the data. The augmentation strategy I used is quite basic, but it has increased the accuracy of my model.

I wrote my own data_generator, which I use to train my model via the fit_generator function in Keras. Essentially, it takes in the whole training data set, shuffles all the time series, and performs the augmentation within each batch: for every time series in the batch, it randomly picks start and end points, so that each batch contains variable-length slices of its series. This obviously creates an almost endless stream of data, but how much of it the model actually sees depends entirely on the number of epochs it is run for, since the dataset is not augmented up front. No noise or anything similar is applied to the data; the augmentation comes purely from varying the lengths of the slices and their start and end points. A sketch of the idea is shown below.
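For illustration, a minimal sketch of such a random-slicing generator might look like the following. Every name here (slice_batch_generator, min_len, and the zero-padding choice for making variable-length slices fit into a rectangular batch) is my own assumption, not the asker's actual code:

```python
import numpy as np

def slice_batch_generator(series_list, targets, batch_size, min_len=16):
    """Endless generator of randomly sliced windows, as fit_generator
    expects. `series_list` holds arrays of shape (timesteps, features);
    all names and defaults here are illustrative."""
    n = len(series_list)
    while True:
        order = np.random.permutation(n)          # reshuffle on every pass
        for b in range(0, n - batch_size + 1, batch_size):
            batch_ids = order[b:b + batch_size]
            slices, ys = [], []
            for i in batch_ids:
                s = series_list[i]
                # random slice length, then a random start for that length
                length = np.random.randint(min_len, len(s) + 1)
                t0 = np.random.randint(0, len(s) - length + 1)
                slices.append(s[t0:t0 + length])
                ys.append(targets[i])
            # pre-pad with zeros to the longest slice so the batch is
            # rectangular (pair with a Masking layer to ignore padding)
            max_len = max(len(sl) for sl in slices)
            x = np.zeros((len(slices), max_len, slices[0].shape[-1]))
            for j, sl in enumerate(slices):
                x[j, -len(sl):, :] = sl
            yield x, np.asarray(ys)
```

Zero-padding is only one way to batch variable-length slices; drawing a single slice length per batch would avoid padding at the cost of less variety within a batch.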

I observe that my loss continues to decrease over time; I have tried 100, 500, 1,000, 5,000 and 10,000 epochs. In general, the accuracy of the model's predictions does get better, but with diminishing returns at some point. It is hard to say exactly when, as I am still tuning the model architecture and hyperparameters.
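For context, this is roughly how such a generator would be wired into training, with the epoch count governing how many augmented slices the model ever sees. The values are illustrative; fit_generator is the older Keras API mentioned above (in current tf.keras, model.fit accepts generators directly):

```python
batch_size = 32                               # illustrative value
steps = len(train_series) // batch_size       # one pass over the shuffled set
model.fit_generator(
    slice_batch_generator(train_series, train_targets, batch_size),
    steps_per_epoch=steps,
    epochs=1000,    # more epochs means more distinct augmented slices seen
)
```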

Does such an augmentation strategy affect how I can interpret the loss of the model? After all, the longer I train the model, the more "new" data it sees, instead of repeatedly training on the same data.

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange