Question

I have a game environment that I want to train an RL model on. The agent can take two fundamental actions in this environment: "Left" or "Right" (say, 0 or 1).

However, "Left" and "Right" can each be taken at a discrete number of "degrees". For example, I can take action "Left" with degree 70%, or action "Right" with degree 16%.

Assume a discrete action space of 0-100% for each of "Left" and "Right", giving a total discrete action space of size 201 (indices 0-200 in increments of 1). Can an agent learn the optimal degree at which to take "Left" or "Right" in any given state?
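To make the encoding concrete, here is a minimal sketch of one way the 201 indices could map to a direction and a degree. The layout is my assumption, not something the environment dictates: index 100 is treated as the shared 0% point, indices below it mean "Left", and indices above it mean "Right". The names `decode_action` and `encode_action` and the "Neutral" label are hypothetical.

```python
from typing import Tuple


def decode_action(index: int) -> Tuple[str, int]:
    """Map a flat action index in [0, 200] to (direction, degree in percent).

    Assumed layout: 0 -> Left 100%, 100 -> 0% (shared), 200 -> Right 100%.
    """
    if not 0 <= index <= 200:
        raise ValueError("action index must be in [0, 200]")
    signed = index - 100  # shift to the range -100..100
    if signed < 0:
        return ("Left", -signed)
    if signed > 0:
        return ("Right", signed)
    return ("Neutral", 0)  # 0% Left and 0% Right collapse to one action


def encode_action(direction: str, degree: int) -> int:
    """Inverse of decode_action under the same assumed layout."""
    if not 0 <= degree <= 100:
        raise ValueError("degree must be in [0, 100]")
    if direction == "Left":
        return 100 - degree
    if direction == "Right":
        return 100 + degree
    raise ValueError("direction must be 'Left' or 'Right'")


print(encode_action("Left", 70))   # 30
print(encode_action("Right", 16))  # 116
print(decode_action(30))           # ('Left', 70)
```

Under this layout the 101 "Left" degrees and 101 "Right" degrees share the 0% point, which is where the count of 201 rather than 202 comes from.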

No accepted answer

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange