Problem

My deep learning lecturer told us that if a hidden node has identical input weights to another, then the weights will remain the same throughout training, i.e. there will be no separation between the nodes. This is confusing to me, because my expectation was that you should only need the input or output weights to be different. Even if the input weights are identical, the output weights would affect the backpropagated gradient at the hidden nodes and create divergence. Why isn't this the case?


Solution

It is fine for a hidden node to have initial weights identical to those of nodes in a different layer, which is what I assume you mean by output weights. The problem with weight symmetry arises when nodes within the same layer, connected to the same inputs and using the same activation function, are initialized identically.

To see this, note that the output of a node $i$ within a hidden layer is given by $$\alpha_i = \sigma(W_i^{T}x + b_i)$$ where $\sigma$ is the activation function, $W_i$ is the vector of incoming weights of node $i$ (the $i$-th row of the layer's weight matrix), $x$ is the input, and $b_i$ is the node's bias.

If the weights $W_{i}=W_{j}$ are identical for nodes $i,j$ (note that the bias is typically initialized to 0), then $\alpha_i = \alpha_j$. The gradient of the loss with respect to $W_i$ is $\delta_i\, x$, where $\delta_i$ depends on node $i$'s activation and on its outgoing weights; when those outgoing weights are also initialized identically (as in a constant or all-zero initialization), $\delta_i = \delta_j$, so the backpropagation pass updates both nodes identically at every step and the symmetry is never broken.
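To make this concrete, here is a minimal NumPy sketch of the symmetric case. The network shape, the constant value 0.5, and the random regression data are all assumptions made purely for illustration: two hidden units share identical incoming and outgoing weights at initialization, and they remain identical after training.

```python
import numpy as np

# Minimal sketch (assumed toy setup): one hidden layer with two sigmoid units
# whose incoming AND outgoing weights start at the same constant, trained with
# plain gradient descent on made-up regression data.

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))            # 100 samples, 3 input features
y = rng.normal(size=(100, 1))            # arbitrary regression targets

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Symmetric initialization: every weight gets the same constant value.
W1 = np.full((3, 2), 0.5)                # input -> hidden (both columns identical)
b1 = np.zeros(2)
W2 = np.full((2, 1), 0.5)                # hidden -> output (both rows identical)
b2 = np.zeros(1)

lr = 0.1
for step in range(200):
    # Forward pass
    z1 = X @ W1 + b1
    a1 = sigmoid(z1)                     # both hidden activations are identical
    y_hat = a1 @ W2 + b2                 # linear output, 0.5 * MSE loss

    # Backward pass
    d_out = (y_hat - y) / len(X)                 # dLoss/dy_hat
    dW2 = a1.T @ d_out
    db2 = d_out.sum(axis=0)
    d_hidden = (d_out @ W2.T) * a1 * (1 - a1)    # sigmoid derivative
    dW1 = X.T @ d_hidden
    db1 = d_hidden.sum(axis=0)

    # Gradient descent update
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

# The two hidden units still have identical weights after training.
print(np.allclose(W1[:, 0], W1[:, 1]))   # True
print(np.allclose(W2[0], W2[1]))         # True
```

If you replace the constant initialization of `W2` with random values in this sketch, `d_hidden`'s two columns differ from the first step onward and the hidden units diverge, which matches your intuition about the output weights; the "no separation" result applies when the initialization of both layers is symmetric.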

License: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange