Which multiplication and addition factor to use when doing adaptive learning rate in neural networks?

https://stackoverflow.com/questions/7345135

27-10-2019
Question

I am new to neural networks and, to get a grip on the subject, I have implemented a basic feed-forward MLP, which I currently train with back-propagation. I am aware that there are more sophisticated and better ways to do that, but Introduction to Machine Learning suggests that, with one or two tricks, basic gradient descent can be effective for learning from real-world data. One of the tricks is an adaptive learning rate.

The idea is to increase the learning rate by a constant value a when the error gets smaller, and decrease it by a fraction b of the learning rate when the error gets larger. So basically the learning rate change is determined by:

+(a)

if we're learning in the right direction, and

-(b * <learning rate>)

if we're ruining our learning. However, the book gives no advice on how to set these parameters. I wouldn't expect a precise suggestion, since parameter tuning is a whole topic of its own, but at least a hint about their order of magnitude. Any ideas?
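For reference, the update rule described above can be sketched in a few lines of Python. This is only an illustration: the function name, the default values of a and b, and the error sequence are made-up placeholders, not recommendations from the book.

```python
def adapt_learning_rate(lr, prev_error, curr_error, a=0.01, b=0.1):
    """Return the learning rate adjusted by the rule described above.

    a and b are hypothetical placeholder values chosen for illustration.
    """
    if curr_error < prev_error:
        return lr + a        # error got smaller: add the constant a
    if curr_error > prev_error:
        return lr - b * lr   # error got larger: remove a fraction b of lr
    return lr                # error unchanged: keep the learning rate


lr = 0.1
errors = [1.0, 0.8, 0.9, 0.7]  # made-up per-epoch training errors
for prev_error, curr_error in zip(errors, errors[1:]):
    lr = adapt_learning_rate(lr, prev_error, curr_error)
```

Because the increase is additive and the decrease is proportional, a only makes sense relative to the initial learning rate, while b is a dimensionless fraction between 0 and 1.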

Thank you,
Tunnuz

Solution

I haven't looked at neural networks for a long time (10+ years), but after seeing your question I had a quick look around. I kept seeing the same figures all over the internet for the increase (a) and decrease (b) factors: 1.2 and 0.5 respectively.

I have managed to track these values down to Martin Riedmiller and Heinrich Braun's RPROP algorithm (1992). Riedmiller and Braun are quite specific about sensible parameters to choose.

See: RPROP: A Fast Adaptive Learning Algorithm
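One caveat when mapping these numbers onto the question: in RPROP both 1.2 and 0.5 are multiplicative factors, and they are applied to a separate step size for each weight depending on whether the gradient kept or changed its sign, rather than to one global learning rate driven by the error. A minimal sketch of that step-size adaptation follows. The function name and the step_min and step_max bounds are assumptions for illustration, and the weight update itself is left out.

```python
ETA_PLUS = 1.2   # increase factor quoted above
ETA_MINUS = 0.5  # decrease factor quoted above


def rprop_step_sizes(steps, prev_grads, grads, step_min=1e-6, step_max=50.0):
    """Adapt one step size per weight from the sign of successive gradients.

    step_min and step_max are assumed bounds used here for illustration.
    """
    new_steps = []
    for step, g_prev, g in zip(steps, prev_grads, grads):
        product = g_prev * g
        if product > 0:
            # Same sign twice in a row: grow the step, capped at step_max.
            step = min(step * ETA_PLUS, step_max)
        elif product < 0:
            # Sign flipped (a minimum was overshot): shrink the step.
            step = max(step * ETA_MINUS, step_min)
        # product == 0: leave the step size unchanged.
        new_steps.append(step)
    return new_steps


steps = rprop_step_sizes([0.1, 0.1, 0.1], [1.0, 1.0, 0.0], [2.0, -1.0, 3.0])
```

In this example the first step grows, the second is halved, and the third is left unchanged.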

I hope this helps.

Licensed under: CC-BY-SA with attribution
