Question

What does $a'$ mean in the "combining" equation in the Dueling DQN paper (top of page 5)?

$$Q(s,a; \theta, \alpha, \beta) = V(s; \theta, \beta) + \biggl( A(s, a; \theta, \alpha) - \frac{1}{N}\sum_{a'=1}^{N}A(s, a'; \theta, \alpha) \biggr)$$

where there are $N$ actions to choose from, and:

  • $s$ is the incoming state (the input vector)
  • $a$ is the action taken? (the chosen action)
  • $a'$ I don't know what it represents in this context
  • $\theta$ represents the weights of the convolutional layers
  • $\alpha$ are the weights of the "Advantage stream" which outputs a vector
  • $\beta$ are the weights of the Value stream (which outputs a scalar)

Why not simply use $a$ everywhere? Why is $a'$ used in the average?
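To make the roles of $a$ and $a'$ concrete, here is a minimal sketch of the combining equation in plain Python. It is only an illustration, not the paper's implementation: the function name `combine_dueling` and the numbers are hypothetical, and the value and advantages are passed in directly instead of being produced by the two network streams.

```python
def combine_dueling(value, advantages):
    """Combine a scalar V(s) with per-action advantages A(s, .).

    Returns a list with one Q(s, a) per action a.
    """
    n_actions = len(advantages)
    # a' (here `a_prime`) is only a summation index: it runs over ALL
    # actions, independently of the action `a` whose Q-value is computed.
    mean_advantage = sum(
        advantages[a_prime] for a_prime in range(n_actions)
    ) / n_actions
    # `a` is the free index: one Q-value is produced for each action.
    return [value + (advantages[a] - mean_advantage) for a in range(n_actions)]


# Hypothetical numbers for a single state s with N = 3 actions.
v = 2.0
adv = [1.0, 0.0, -1.0]
q = combine_dueling(v, adv)
print(q)  # [3.0, 2.0, 1.0]
```

In this sketch the subtracted mean is the same number for every `a`, which suggests why a separate symbol is needed: reusing $a$ inside the sum would clash with the $a$ that is fixed on the left-hand side of the equation.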

No correct solution

Licensed under: CC-BY-SA with attribution