Question

I have a game environment that I want to train an RL model on. The environment has two fundamental actions the agent can take: "Left" or "Right" (say, 0 or 1).

However, each of "Left" and "Right" can be taken to a discrete "degree". For example, I can take action "Left" with degree 70%, or take action "Right" with degree 16%.

Assume a discrete degree between 0% and 100% for each of "Left" and "Right", making the total action space a discrete set of size 201 (indices 0-200 in increments of 1). Can an agent learn the optimal degree to take either "Left" or "Right" in any given state?
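
To make the 201-action encoding concrete, here is a minimal sketch of one possible mapping between a flat action index and a (direction, degree) pair. The layout is an assumption, not something fixed by the environment: indices 0-99 mean "Left" at 100% down to 1%, index 100 means no movement (0% either way), and indices 101-200 mean "Right" at 1% up to 100%. The names `decode_action` and `encode_action` are hypothetical helpers for illustration only.

```python
from typing import Tuple

N_ACTIONS = 201  # flat indices 0..200, matching the size given above


def decode_action(index: int) -> Tuple[str, int]:
    """Map a flat action index to (direction, degree in percent).

    Assumed layout (an illustration, not part of the environment):
      0..99    -> "Left"  at 100%..1%
      100      -> "None"  (0% in either direction)
      101..200 -> "Right" at 1%..100%
    """
    if not 0 <= index < N_ACTIONS:
        raise ValueError(f"action index must be in 0..{N_ACTIONS - 1}, got {index}")
    signed = index - 100  # negative means Left, positive means Right
    if signed < 0:
        return ("Left", -signed)
    if signed > 0:
        return ("Right", signed)
    return ("None", 0)


def encode_action(direction: str, degree: int) -> int:
    """Inverse of decode_action under the same assumed layout."""
    if direction == "None":
        if degree != 0:
            raise ValueError("direction 'None' requires degree 0")
        return 100
    if not 1 <= degree <= 100:
        raise ValueError(f"degree must be in 1..100, got {degree}")
    if direction == "Left":
        return 100 - degree
    if direction == "Right":
        return 100 + degree
    raise ValueError(f"unknown direction: {direction!r}")
```

Under this assumed layout, "Left" with degree 70% is index 30 and "Right" with degree 16% is index 116. A discrete-action agent would then simply output one of the 201 indices per step.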

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange