Question

In backpropagation training, gradient descent down the error surface can get stuck in a local minimum, especially in a network with a large number of neurons in the hidden layer. I have read that reinitializing the weights to new random values will, in all cases, eventually avoid this problem. This means that there always IS a set of "correct" initial weight values. (Is this safe to assume?)
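(To illustrate what I mean, the naive version would be a random-restart loop like the sketch below, where `train_mlp` and `evaluate_loss` are just placeholders for my own training and error code; I am hoping for something less brute-force than this.)

```python
# Naive random-restart baseline: retrain from several fresh random
# initializations and keep the best run. `train_mlp` and `evaluate_loss`
# are placeholders for your own training and error-measurement code.
import math

def random_restarts(train_mlp, evaluate_loss, n_restarts=10):
    best_weights, best_error = None, math.inf
    for restart in range(n_restarts):
        weights = train_mlp(seed=restart)   # reinitialize weights with a new random seed, then train
        error = evaluate_loss(weights)      # final training error of this run
        if error < best_error:
            best_weights, best_error = weights, error
    return best_weights, best_error
```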

I need to find or make an algorithm that finds them.

I have tried googling for such an algorithm and tried devising one myself, but to no avail. Can anyone propose a solution? Perhaps the name of an algorithm that I can search for?

Note: this is a regular feed-forward 3-layer burrito :)

Note: I know attempts have been made to use GAs for that purpose, but that requires re-training the network on each iteration, which becomes too time-costly once the network gets large enough.

Thanks in advance.


Solution

There is never a guarantee that you will not get stuck in a local optimum, sadly. Unless you can prove certain properties about the function you are trying to optimize, local optima exist and hill-climbing methods will fall prey to them. (And typically, if you can prove the things you need to prove, you can also select a better tool than a neural network.)

One classic technique is to gradually reduce the learning rate, then raise it and slowly draw it down again, repeating this several times. Raising the learning rate reduces the stability of the algorithm, but gives it the ability to jump out of a local optimum. This is closely related to simulated annealing.
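A minimal sketch of such a schedule in Python (the cycle length, maximum and minimum rates, and the exponential decay are arbitrary example choices, not values from any particular paper): the learning rate decays within each cycle and jumps back up at the start of the next one.

```python
def cyclic_learning_rate(step, cycle_len=1000, lr_max=0.5, lr_min=0.005):
    """Decay the learning rate within each cycle, then jump back to lr_max.

    All constants are arbitrary example values; tune them for your network.
    """
    t = step % cycle_len                          # position within the current cycle
    decay = (lr_min / lr_max) ** (t / cycle_len)  # exponential decay from lr_max toward lr_min
    return lr_max * decay

# Usage inside a plain gradient-descent loop (gradient() stands for your backprop code):
#     for step in range(num_steps):
#         lr = cyclic_learning_rate(step)
#         weights -= lr * gradient(weights)
```

The periodic jump back to lr_max is what gives the search a chance to escape a basin it has settled into, at the cost of temporarily undoing some convergence.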

I am surprised that Google has not helped you here, as this is a topic with many published papers. Try terms like "local minima" and "local minima problem" in conjunction with "neural networks" and "backpropagation"; you should see many references to improved backprop methods.

Licensed under: CC-BY-SA with attribution