Question

I wonder whether one epoch of mini-batch gradient descent is slower than one epoch of plain batch gradient descent.

I do understand that a single iteration of mini-batch gradient descent should be faster than a single iteration of batch gradient descent, since it processes only a subset of the data.

However, if I understand it correctly, mini-batch gradient descent updates the weights once per mini-batch, so in one epoch it performs (dataset size / batch size) updates. Because of all these extra update steps, wouldn't one epoch of training be slower than with batch gradient descent, which computes the gradient and updates the weights only once per epoch?
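To make the comparison concrete, here is a minimal sketch of the two update schedules (the linear model, squared-error gradient, data shapes, learning rate, and batch size are illustrative assumptions, not part of the original question):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))   # hypothetical dataset: 1000 samples, 10 features
y = rng.normal(size=1000)
w = np.zeros(10)
lr = 0.01

def grad(w, Xb, yb):
    # gradient of mean squared error for a linear model on the given samples
    return 2 * Xb.T @ (Xb @ w - yb) / len(yb)

# Batch gradient descent: one gradient over all 1000 samples, one update per epoch.
w -= lr * grad(w, X, y)

# Mini-batch gradient descent: with batch_size = 100, one epoch performs
# 1000 / 100 = 10 updates, but each gradient touches only 100 samples,
# so the total gradient work per epoch is roughly comparable.
batch_size = 100
for start in range(0, len(X), batch_size):
    Xb, yb = X[start:start + batch_size], y[start:start + batch_size]
    w -= lr * grad(w, Xb, yb)
```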

Is this correct? If so, is the increase in overall training time something worth worrying about?

No accepted solution
