Question

If an epoch is defined as one pass of the neural network training process over the whole training data, how is it that when starting the next epoch, the loss is almost always smaller than during the first one? Does this mean that after an epoch the weights of the neural network are not reset, and that each epoch is not a standalone training process?

Solution

An epoch is not a standalone training process, so no, the weights are not reset after an epoch is complete. Epochs are merely used to keep track of how much data has been used to train the network. It's a way to represent how much "work" has been done.

Epochs are used to compare how "long" it takes to train a given network regardless of hardware: if a network takes 3 epochs to converge, it takes 3 epochs to converge on any machine. Wall-clock time would be less meaningful, since one machine might finish an epoch in 10 minutes while another setup needs 45.

Neural networks are usually not able to learn enough by seeing the data only once, which is why multiple epochs are often required. Think of it as studying the syllabus for a course: once you have finished the syllabus (the first epoch), you go over it again to understand it even better (epoch 2, epoch 3, and so on).
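To make this concrete, here is a minimal sketch (not from the original answer) using PyTorch on toy data; the model, data, and hyperparameters are all illustrative. The key point is that the model is created, and its weights initialized, exactly once, and the epoch loop simply repeats passes over the same data with the same weights:

```python
import torch
import torch.nn as nn

# Toy regression data: learn y = 2x + 1
X = torch.linspace(-1, 1, 64).unsqueeze(1)
y = 2 * X + 1

model = nn.Linear(1, 1)          # weights initialized ONCE, before training
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

for epoch in range(3):           # each epoch = one full pass over the data
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()              # gradients via backpropagation
    optimizer.step()             # update the SAME weights; nothing is reset
    print(f"epoch {epoch}: loss = {loss.item():.4f}")
```

Each epoch's printed loss should be lower than the last, because epoch 2 starts from the weights epoch 1 ended with.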

OTHER TIPS

How is it that when starting the next epoch, the loss is almost always smaller than the first one? Does this mean that after an epoch the weights of the neural network are not reset?

Yes. The network weights are initialized once, before training starts. After every iteration, the weights are updated by backpropagation using the error gradients obtained from the batch of data fed to the network at that iteration. Once an epoch is done, the weights are better fitted to your training data, meaning you get a lower training loss. The next epoch builds on the weights you got after the first epoch to improve performance further. This is why the loss will keep decreasing as the network is trained for more epochs (assuming the hyperparameters are set properly).
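A hand-rolled sketch of that per-iteration update (illustrative only, assuming plain linear regression with a mean-squared-error loss): each mini-batch yields a gradient, and the same parameters are updated in place, iteration after iteration and epoch after epoch:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1))
y = 3 * X[:, 0] + rng.normal(scale=0.1, size=100)

w, b = 0.0, 0.0                  # initialized once, never reset
lr, batch_size = 0.1, 20

for epoch in range(3):
    for start in range(0, len(X), batch_size):   # one iteration per batch
        xb = X[start:start + batch_size, 0]
        yb = y[start:start + batch_size]
        err = (w * xb + b) - yb                  # prediction error on the batch
        # gradients of the mean-squared error w.r.t. w and b
        grad_w = 2 * np.mean(err * xb)
        grad_b = 2 * np.mean(err)
        w -= lr * grad_w                          # update the same weights in place
        b -= lr * grad_b
    loss = np.mean(((w * X[:, 0] + b) - y) ** 2)
    print(f"epoch {epoch}: loss = {loss:.4f}")
```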

each epoch is not a standalone training process?

Yes. An epoch is a part of the training process. You improve the network's performance by training it for as many epochs as necessary to achieve the desired performance.

Yes, the weights are not reset. The network keeps training on the same set of weights continuously, epoch after epoch.

An epoch means the model has completed one pass of training over the entire dataset. The loss at the start of the next epoch is smaller because the model has already improved during the previous one.
