I understand the backpropagation algorithm for neural networks and how the error propagates backwards through the layers. That is, given a 3-layer feed-forward network, I understand that the update to W1 depends on the weights of layers 2 and 3, as well as on the derivatives of their activation functions.
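For concreteness, that dependence is just the standard chain-rule expansion. A minimal sketch in generic notation (the symbols $z^{(l)}$, $\delta^{(l)}$, $f_l$ are my own naming, not from any particular text):

$$
\delta^{(3)} = \nabla_{a^{(3)}} L \odot f_3'\!\left(z^{(3)}\right), \qquad
\delta^{(l)} = \left(\left(W^{(l+1)}\right)^{\!\top} \delta^{(l+1)}\right) \odot f_l'\!\left(z^{(l)}\right), \qquad
\frac{\partial L}{\partial W^{(1)}} = \delta^{(1)}\, x^{\top}
$$

so the gradient for $W^{(1)}$ indeed carries the weights $W^{(2)}, W^{(3)}$ and the activation derivatives of the layers above.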

Question: When the first layer is an embedding layer (e.g., an embedding matrix initialized with GloVe vectors), how does the network update that matrix using backpropagation? How do you represent that layer as an equation consisting of the input and some weight matrix?
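One way to see it: the embedding layer can be written as $h = xE$, where $x$ is a one-hot row vector for the token and $E$ is the embedding matrix, so it is an ordinary linear layer (without bias) and gets ordinary gradients. A minimal NumPy sketch of this view (all names and sizes here are hypothetical, chosen for illustration):

```python
import numpy as np

# Hypothetical sizes for illustration.
vocab_size, embed_dim = 10, 4
rng = np.random.default_rng(0)

# Embedding matrix E (e.g., rows initialized from GloVe vectors).
E = rng.normal(size=(vocab_size, embed_dim))

# A token id and its equivalent one-hot row vector x.
token_id = 3
x = np.zeros(vocab_size)
x[token_id] = 1.0

# Forward pass: the "layer" is h = x @ E, which selects row token_id of E.
h = x @ E
assert np.allclose(h, E[token_id])

# Backward pass: given dL/dh flowing back from the layers above,
# dL/dE = outer(x, dL/dh) -- zero everywhere except row token_id.
dL_dh = rng.normal(size=embed_dim)  # stand-in for the upstream gradient
dL_dE = np.outer(x, dL_dh)

# Gradient step: only the looked-up row of E changes.
lr = 0.1
E -= lr * dL_dE
```

Because $x$ is one-hot, the lookup `E[token_id]` and the product `x @ E` are the same operation, and $\partial L/\partial E = x^{\top}\,(\partial L/\partial h)$ is zero everywhere except the looked-up row. That is why frameworks implement embeddings as a lookup with sparse gradients: only the rows for tokens actually seen in a batch get updated.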
