Question

I wonder whether one epoch of mini-batch gradient descent is slower than one epoch of plain batch gradient descent.

At least I understand that one iteration of mini-batch gradient descent should be faster than one iteration of batch gradient descent.

However, if I understand it correctly, mini-batch gradient descent updates the weights once per mini-batch, i.e. (dataset size / batch size) times per epoch, so one epoch of training would be slower than batch gradient descent, which computes the gradient and updates the weights only once per epoch.
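To make what I mean concrete, here is a minimal sketch (a made-up linear-regression toy problem, not real training code) of one epoch of each variant; both passes touch every sample exactly once, but the mini-batch version performs N / batch_size weight updates while the batch version performs only one:

```python
import numpy as np

# Hypothetical toy setup: linear regression on random data,
# just to compare the number of weight updates per epoch.
rng = np.random.default_rng(0)
N, D = 10_000, 20            # samples, features
X = rng.standard_normal((N, D))
y = X @ rng.standard_normal(D) + 0.1 * rng.standard_normal(N)

lr = 0.01

def gradient(Xb, yb, w):
    # Mean-squared-error gradient for a linear model.
    return 2.0 * Xb.T @ (Xb @ w - yb) / len(yb)

# Batch gradient descent: ONE update per epoch, over all N samples.
w_batch = np.zeros(D)
w_batch -= lr * gradient(X, y, w_batch)          # 1 update

# Mini-batch gradient descent: N / batch_size updates per epoch.
batch_size = 64
w_mini = np.zeros(D)
for start in range(0, N, batch_size):
    Xb, yb = X[start:start + batch_size], y[start:start + batch_size]
    w_mini -= lr * gradient(Xb, yb, w_mini)      # ~157 updates in total
```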

Is this correct? If so, is the loss in overall training time something worth worrying about?

No accepted answer

License: CC BY-SA, attribution: datascience.stackexchange