This ticket was closed as off-topic, so I answered it in a comment. Here is my answer as a post.
My answer is generic and applies to all systems, not just Theano. Since each iteration of your loop depends on the previous one, you can't completely parallelize the iterations. You could parallelize u=data[t], as it doesn't depend on the previous x. You could parallelize dot( Win, vstack((1,u)) ) for the same reason. But you can't parallelize dot(W,x) or anything that depends on it, such as the tanh and the lines after it.
If you want to optimize this, you can move all computation that doesn't depend on x outside the loop. This lets you work on more data at once, which could be faster. So the dot(Win, ...) could be sped up. But this will increase memory usage, since the results for every time step have to be stored at once.
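
To illustrate the idea, here is a minimal sketch in plain NumPy rather than Theano. I don't have the full loop from the question, so it makes several assumptions: the update is x = tanh(dot(Win, vstack((1,u))) + dot(W, x)), data is a 1-D array of scalar inputs, and x starts at zero. The function names run_loop and run_hoisted are made up for this example.

```python
import numpy as np

def run_loop(Win, W, data):
    """Baseline: everything is computed inside the loop (assumed update rule)."""
    n_res = W.shape[0]
    x = np.zeros((n_res, 1))
    states = []
    for t in range(data.shape[0]):
        u = data[t]
        # Input term and recurrent term are both computed at every step.
        x = np.tanh(np.dot(Win, np.vstack((1, u))) + np.dot(W, x))
        states.append(x)
    return np.hstack(states)

def run_hoisted(Win, W, data):
    """Same result, but the input term is computed once for all time steps."""
    n_res = W.shape[0]
    T = data.shape[0]
    # Stack a row of ones (the bias) on top of all inputs: shape (2, T).
    U = np.vstack((np.ones(T), data))
    # One large matrix product instead of T small ones; costs (n_res, T) memory.
    drive = np.dot(Win, U)
    x = np.zeros((n_res, 1))
    states = np.empty((n_res, T))
    for t in range(T):
        # Only the part that depends on the previous x stays in the loop.
        x = np.tanh(drive[:, t:t + 1] + np.dot(W, x))
        states[:, t] = x[:, 0]
    return states
```

The loop itself is still sequential, because dot(W, x) needs the previous x. The only gain is that the input projection becomes a single large product, at the cost of keeping the whole (n_res, T) array in memory.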