Question

I have a custom environment with a multi-discrete action space.

The action and observation spaces are as follows:

Action:

MultiDiscrete([  3 121 121 121   3 121 121 121   3 121 121 121   3 121 121 121   3 121
 121 121   3 121 121 121   3 121 121 121   3 121 121 121   3 121 121 121
   3 121 121 121   3 121 121 121   3 121 121 121   3 121 121 121   3 121
 121 121   3 121 121 121   3 121 121 121   3 121 121 121])

Observation:

MultiDiscrete([100   3   2 121   2 121   2 121   2 121   2 121   2 121   2 121   2 121
   2 121   2 121   2 121   2 121   2 121   2 121   2 121   2 121   2 121
   2 121   2 121   2 121   2 121   2 121   2 121   2 121   2 121   2 121
   2 121   2 121   2 121   2 121   2 121   2 121   2 121   2 121   2 121
 121 121 121 121 121 121 121 121 121 121 121 121 121 121 121 121 121 121
 121 121 121 121 121 121 121 121 121 121 121 121 121 121 121])

I am having an extremely tough time finding an agent (for example in keras-rl) that is capable of handling these spaces.

This issue (https://github.com/keras-rl/keras-rl/issues/224) indicates that the keras-rl DDPG agent can handle a multi-discrete action space, but the model produces float outputs, which I cannot pass as an action to step() because it expects integers.

Most other agents seem to use a tanh activation layer, or some layer that produces a binary output. I need an output in the same shape as my action space.

How can this be handled?
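
One possible workaround, offered only as a sketch and not as something keras-rl provides: if the actor's final layer is a tanh, so each output is assumed to lie in [-1, 1], every float can be binned into an integer index for its dimension before being passed to step(). The name `floats_to_multidiscrete` is hypothetical, and `NVEC` is copied from the action space printed above (17 groups of 3, 121, 121, 121).

```python
from typing import List, Sequence

# Sizes copied from the action space printed in the question:
# 17 groups of (3, 121, 121, 121), 68 dimensions in total.
NVEC: List[int] = [3, 121, 121, 121] * 17


def floats_to_multidiscrete(raw: Sequence[float], nvec: Sequence[int]) -> List[int]:
    """Map floats assumed to lie in [-1, 1] onto indices 0..n-1 per dimension.

    Hypothetical helper, not part of keras-rl or gym.
    """
    if len(raw) != len(nvec):
        raise ValueError("raw output and nvec must have the same length")
    action = []
    for x, n in zip(raw, nvec):
        x = max(-1.0, min(1.0, x))              # clip, e.g. after exploration noise
        scaled = (x + 1.0) / 2.0 * n            # rescale [-1, 1] to [0, n]
        action.append(min(int(scaled), n - 1))  # equal-width bins; x == 1 lands in the last bin
    return action


# Example with one group of four dimensions.
print(floats_to_multidiscrete([-1.0, 0.0, 1.0, 2.0], [3, 121, 121, 121]))
```

Note that binning like this treats each dimension as ordered, which may be reasonable for the 121-valued entries but is questionable for the 3-valued ones if they are unordered categories. In that case, a policy with one categorical (softmax) head per dimension may be a better fit than a continuous-action agent.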

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange