Question

My question is about backpropagation and about understanding the terms "feedforward NN" and "backpropagation". I really have two questions:

  1. If I understand correctly, even a feedforward network updates its weights (via a delta rule, for example). Isn't that also backpropagation? You start with some random weights, run the data through the network, compare the output with the desired output, and then update the weights. That is backpropagation, right? So what's the difference between a FFW NN and an RNN? And if you can't backpropagate, how do you update the weights in a FFW NN?

  2. I've seen NN architectures looking like this: [image: a network in which every hidden neuron receives the same inputs]

Basically all the neurons are being fed the same data, right? OK, the weights are randomized and thus different in the beginning, but how do you make sure that Temp. Value #1 ends up different from Temp. Value #2 if you use the same update rule?

Thank you!


Solution

I'm not an expert on the backpropagation algorithm, but I can explain a few things. Every neural network can update its weights. It may do this in different ways, but the usual mechanism is backpropagation: the error at the output is propagated backwards to compute a gradient for every weight, and that works regardless of the network architecture.
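As a minimal sketch of what such an update looks like, here is a single linear neuron trained with the delta rule the question mentions. The learning rate, data, and use of NumPy are illustrative assumptions, not anything from the original post:

```python
# A minimal sketch (assumed values) of the delta rule:
# one linear neuron trained by gradient descent.
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=3)          # random initial weights
lr = 0.1                        # learning rate (assumed)

x = np.array([0.5, -1.0, 2.0])  # one input example
target = 1.0                    # desired output

for _ in range(10):
    y = w @ x                   # forward pass: prediction
    error = y - target          # compare with the desired output
    w -= lr * error * x         # delta rule: move weights against the error
```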

A feed-forward network is a regular network, as seen in your picture: a value is received by a neuron and then passed on to the next one, so information only flows forwards.

[image: a feed-forward network, values flowing from inputs to outputs]
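A rough sketch of such a forward pass, assuming a single sigmoid hidden layer and made-up sizes, might look like this:

```python
# Sketch of a feed-forward pass: values move strictly from input to output.
# Layer sizes and the sigmoid activation are assumptions for illustration.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
W1 = rng.normal(size=(4, 3))    # input (3) -> hidden (4)
b1 = rng.normal(size=4)
W2 = rng.normal(size=(1, 4))    # hidden (4) -> output (1)
b2 = rng.normal(size=1)

x = np.array([0.2, -0.4, 0.7])
h = sigmoid(W1 @ x + b1)        # hidden activations
y = sigmoid(W2 @ h + b2)        # output; no connection points backwards
```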

A recurrent neural network is almost the same as a FFN; the difference is that an RNN has some connections that point 'backwards', e.g. a neuron is connected to a neuron that has already done its 'job' during the forward pass. Because of this, the activations from the previous step have an effect on the new output.

[image: a recurrent network, with some connections pointing 'backwards']
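A sketch of one recurrent step, with invented sizes and a tanh activation, shows how the previous hidden state re-enters the computation:

```python
# Sketch of a recurrent step: the previous hidden state feeds back in,
# so earlier inputs influence later outputs. Sizes/weights are illustrative.
import numpy as np

rng = np.random.default_rng(2)
W_xh = rng.normal(size=(4, 3))  # input -> hidden
W_hh = rng.normal(size=(4, 4))  # hidden -> hidden (the 'backwards' links)
W_hy = rng.normal(size=(1, 4))  # hidden -> output

h = np.zeros(4)                 # hidden state carried between time steps
for x_t in [np.array([1.0, 0.0, 0.0]),
            np.array([0.0, 1.0, 0.0])]:
    h = np.tanh(W_xh @ x_t + W_hh @ h)  # previous activations re-enter here
    y_t = W_hy @ h                      # output depends on the whole history
```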

On question #2

Interesting question. This has to do with weight initialization. Yes, you're right, each neuron in the hidden layer receives the same connections. However, during the initialization process each connection was given a random weight. Depending on your NN library, the neurons may also have been initialized with a random bias.
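A small illustrative snippet (plain NumPy, with assumed layer sizes) shows that neurons receiving exactly the same input already behave differently right after random initialization:

```python
# Sketch of why identically connected hidden neurons differ: each one draws
# its own random weights (and bias) at initialization.
import numpy as np

rng = np.random.default_rng(3)
n_inputs, n_hidden = 3, 2

W = rng.normal(scale=0.1, size=(n_hidden, n_inputs))  # one row per neuron
b = rng.normal(scale=0.1, size=n_hidden)

x = np.array([1.0, 2.0, 3.0])   # the same input for every hidden neuron
h = np.tanh(W @ x + b)          # yet the activations already differ
print(h)                        # e.g. two different "Temp. Values"
```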

So even though the same rule is applied, each neuron produces a different outcome, because all of its connections carry different weights than the other neurons' connections.

On your comment: just because all the neurons happen to use the same backpropagation rule doesn't mean they will end up with the same weights.

Because they are initialized with random weights, each neuron's error is different. They therefore have different gradients and receive different weight updates.
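A simplified sketch of this, treating each hidden neuron as if it had its own error signal and assuming a tanh activation and a made-up learning rate, shows the same update rule producing different gradients for differently initialized weights:

```python
# Sketch: same rule, different starting weights -> different gradients.
# One training step for two neurons that see the same input and target.
import numpy as np

rng = np.random.default_rng(4)
x = np.array([1.0, -1.0])                        # shared input
w1, w2 = rng.normal(size=2), rng.normal(size=2)  # different random starts
target = 0.5
lr = 0.1

for w in (w1, w2):
    y = np.tanh(w @ x)               # each neuron's own activation
    error = y - target               # so each neuron's own error...
    grad = error * (1 - y**2) * x    # ...and its own gradient (tanh')
    w -= lr * grad                   # same update rule, different result
```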

You also have to keep in mind that there are multiple weight configurations that produce a given output (due to the non-linearity). So, thanks to the random initial weights, one neuron may end up close to one of those solutions while another neuron ends up closer to a different one.

Additionally, as was stated in the comments, a network works as a whole. The output neuron is also non-linear, and for most tasks the desired output is a non-linear function of the inputs, so the output neuron most likely needs the hidden neurons to activate at different input values.
