Question

I am trying to teach myself RNNs, and I have a question.

And so, imagine 2 layers: an input layer with three neurons $(x_1, x_2, x_3)$ and a classic recurrent layer with 2 neurons and an activation function $f$. I will write out the outputs of the recurrent layer at each step: $h_{t_1} = f(W[x_1, h_0] + b)$ and $h_{t_2} = f(W[x_2, h_{t_1}] + b)$, where $h_0$ is the initial (zero) hidden state. It turns out that $x_3$ is never used. What should I do in this case?

And also, let's imagine a slightly different RNN architecture.

An input layer with two neurons $(x_1, x_2)$ and a classic recurrent layer with 3 neurons and an activation function $f$. I will write out the outputs of the recurrent layer at each step: $h_{t_1} = f(W[x_1, h_0] + b)$ and $h_{t_2} = f(W[x_2, h_{t_1}] + b)$. It turns out that the 3rd neuron of the RNN layer is never used. What should I do in this case?

Please help me figure out how the neural network works in these cases. Thanks!

UPD:
I realized that I don't know how recurrent neural networks work when the number of neurons in the recurrent layer is not equal to the number of inputs.

I have one thought: the number of inputs always has to equal the number of neurons in the RNN layer. But the code below contradicts my guess.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, SimpleRNN, Dense

model = Sequential()
model.add(Embedding(maxWordsCount, 256, input_length=inp_words))  # one 256-dim vector per word
model.add(SimpleRNN(128, activation='tanh'))  # hidden state of size 128
model.add(Dense(maxWordsCount, activation='softmax'))  # one probability per word
model.summary()

That's a model for predicting the next word.


Solution

RNN cell versus neuron

You are mixing up RNN cells and neurons; I understand that you are referring to RNN cells in your question. Traditionally, a recurrent layer of a sequence model applies the same cell once per time step, so the number of cell applications always equals the length of your input sequence, not the number of neurons.

See difference between cell and neuron here: Difference between cell state and hidden state

See input size for sequence models here: How can I picture an unfolded RNN as a normal Feed Forward Network?
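
To make this concrete: the cell does not assign one neuron per input. At every time step it consumes the whole input vector for that step, so every $x_i$ is used no matter how many units the layer has. A minimal sketch (toy shapes, chosen only for illustration):

import numpy as np
from tensorflow.keras.layers import SimpleRNN

# 1 sequence, 3 time steps (x1, x2, x3), 1 feature per step;
# the layer has 2 units, i.e. a 2-dimensional hidden state.
x = np.array([[[1.0], [2.0], [3.0]]])  # shape (batch, timesteps, features)
layer = SimpleRNN(2, activation='tanh')
h = layer(x)
print(h.shape)  # (1, 2): one 2-dim hidden state after consuming all 3 steps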

Re: code snippet

In SimpleRNN, the first argument is the number of units in the cell, i.e. the number of neurons, which sets the dimensionality of the hidden state. The number of cells (time steps) is not an argument of the layer; it is always equal to the length of your input sequences, here inp_words.

model = Sequential()
model.add(Embedding(maxWordsCount, 256, input_length = inp_words))
# Below, 128 is the number of neurons (units) in the cell; it sets the
# size of the hidden state and hence the cell's memory capacity.
model.add(SimpleRNN(128, activation='tanh'))
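
If you want to convince yourself, a quick shape check (with made-up stand-ins for maxWordsCount and inp_words) shows that the output size depends only on the number of units, never on the sequence length:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, SimpleRNN

model = Sequential([
    Embedding(1000, 256, input_length=10),  # vocabulary of 1000, 10-word inputs
    SimpleRNN(128, activation='tanh'),
])
out = model(np.zeros((4, 10), dtype='int32'))  # batch of 4 sequences
print(out.shape)  # (4, 128): the sequence length never appears in the output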

OTHER TIPS

Inside SimpleRNN, the input (of dimensionality 256) is projected onto a representation space of dimensionality 128 by means of a matrix multiplication; the RNN operations work with these vectors of size 128. If you take a look at the source code of SimpleRNN, you can see that the projection matrix is stored in a member variable called kernel, and that one of the first things SimpleRNNCell.call does is project the inputs with K.dot(inputs, self.kernel).
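
For intuition, here is a minimal NumPy sketch of one SimpleRNN step; the variable names mirror the Keras source, and the sizes are assumed to match the model above (256-dim inputs, 128 units):

import numpy as np

def simple_rnn_step(x_t, h_prev, kernel, recurrent_kernel, bias):
    # Mirrors SimpleRNNCell.call: project the input with `kernel`,
    # add the recurrent contribution, then apply the activation.
    return np.tanh(x_t @ kernel + h_prev @ recurrent_kernel + bias)

rng = np.random.default_rng(0)
kernel = rng.normal(size=(256, 128))            # input projection
recurrent_kernel = rng.normal(size=(128, 128))  # hidden-to-hidden weights
bias = np.zeros(128)

h = np.zeros(128)                       # initial hidden state
for x_t in rng.normal(size=(10, 256)):  # 10 time steps of 256-dim inputs
    h = simple_rnn_step(x_t, h, kernel, recurrent_kernel, bias)
print(h.shape)  # (128,): one hidden state, whatever the sequence length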

P.S.: To me, the "neurons" analogy has always been misleading. I like to think about neural networks in terms of differentiable matrix operations: matrix multiplication, matrix addition, and position-wise transformations like the sigmoid, hyperbolic tangent, and ReLU. This makes it easier to reason about the dimensionality of the input and output of each computation step.

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange