Question

I am playing around with OpenAI Gym to try to better understand reinforcement learning. One environment parameter you can modify is the action space, i.e. the specific actions an agent can take at each state — for example "Left", "Right", "Up" or "Down" in a game with 4 discrete actions.

In my research, I have not found anywhere that explicitly states that an RL model, specifically PPO2, will take longer to train if the action space is larger.

All else being held the same (same data, same environment, same hyperparameters, same hardware), will a model with a larger action space (more possible actions) take longer to train for one episode than a model with a smaller action space?

(e.g. will an agent with 100 possible actions take longer to train for one episode than an agent with 2 possible actions?)

Intuitively, I would have thought that the more actions an agent has, the more "choice" it has at each state, and therefore the longer it would take to choose one of those actions. But again, I haven't found anything proving or disproving this.
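To make the per-step cost part of the question concrete, here is a minimal sketch (plain NumPy, not PPO2 itself — the MLP shapes and sizes are my own assumptions) showing the only structural difference a larger discrete action space forces on the policy network: a wider output layer and softmax. Timing forward passes for 2 vs. 100 actions shows how small that difference is relative to the rest of the network:

```python
import numpy as np
import time

def policy_forward(obs, W_hidden, W_out):
    """One forward pass of a tiny MLP policy: hidden layer, then softmax over actions."""
    h = np.tanh(obs @ W_hidden)          # hidden activations (same cost for any action count)
    logits = h @ W_out                    # only this matmul grows with the action space
    exp = np.exp(logits - logits.max())   # numerically stable softmax
    return exp / exp.sum()

rng = np.random.default_rng(0)
obs_dim, hidden = 8, 64                   # arbitrary illustrative sizes
obs = rng.standard_normal(obs_dim)
W_hidden = rng.standard_normal((obs_dim, hidden))

for n_actions in (2, 100):
    W_out = rng.standard_normal((hidden, n_actions))
    start = time.perf_counter()
    for _ in range(10_000):
        probs = policy_forward(obs, W_hidden, W_out)
    elapsed = time.perf_counter() - start
    print(f"{n_actions:3d} actions: output layer {hidden}x{n_actions}, "
          f"10k forward passes in {elapsed:.3f}s")
```

This only measures the action-selection cost per step, of course — it says nothing about how many more samples a larger action space might need to converge, which is a separate question from wall-clock time per episode.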


Licensed under: CC-BY-SA with attribution