Question

I wonder whether one epoch using mini-batch gradient descent is slower than one epoch using just batch gradient descent.

At least I understand that one iteration of mini-batch gradient descent should be faster than one iteration of batch gradient descent.

However, if I understand it correctly, mini-batch gradient descent updates the weights once per mini-batch, so it performs (dataset size / batch size) updates in one epoch. That makes me think one epoch of training would be slower than with batch gradient descent, which computes the gradient and updates the weights only once per epoch.
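To make the comparison concrete, here is a minimal sketch (my own toy example, assuming a simple NumPy linear-regression setup with made-up data) that counts how many weight updates each variant performs in a single epoch:

```python
import numpy as np

# Hypothetical toy setup: linear regression on a small synthetic dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))                       # 1000 examples, 5 features
y = X @ rng.normal(size=5) + rng.normal(scale=0.1, size=1000)

def run_epoch(w, batch_size, lr=0.01):
    """One epoch of gradient descent; returns updated weights and update count."""
    n_updates = 0
    for start in range(0, len(X), batch_size):
        xb, yb = X[start:start + batch_size], y[start:start + batch_size]
        grad = 2 * xb.T @ (xb @ w - yb) / len(xb)    # MSE gradient on this batch
        w = w - lr * grad
        n_updates += 1
    return w, n_updates

w0 = np.zeros(5)
_, full_batch_updates = run_epoch(w0, batch_size=len(X))  # batch GD: 1 update per epoch
_, mini_batch_updates = run_epoch(w0, batch_size=32)      # mini-batch GD: 32 updates per epoch
print(full_batch_updates, mini_batch_updates)             # -> 1 32
```

Both variants pass over all 1000 examples once per epoch; the difference is only in how many times the weights are updated along the way.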

Is this correct? If so, is the increase in overall training time something I should worry about?

No correct solution

Licensed under: CC-BY-SA with attribution