Question

Why don't gym environments come with "valid actions"? A standard gym environment accepts any action as input, even one that isn't possible in the current state.

Is this normal in reinforcement learning? Do the models really have to learn which actions are valid all the time? Would it not be much nicer to have an env.get_valid_actions() function so that the model knows which actions are doable? Or is this somehow possible and I'm missing it?
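For context, one common workaround is an "action mask": the environment exposes a boolean vector over the discrete action space, and the agent samples only among the entries marked valid. The `get_valid_actions` method and the toy environment below are hypothetical, not part of the gym API; this is just a minimal sketch of the idea:

```python
import random

class MaskedGridEnv:
    """Toy 1-D grid world: action 0 = move left, action 1 = move right.
    At the boundaries one of the two actions is invalid.
    (Hypothetical example, not part of the gym API.)"""

    def __init__(self, size=5):
        self.size = size
        self.pos = 0

    def get_valid_actions(self):
        # Boolean mask over the discrete action space {0, 1}.
        return [self.pos > 0, self.pos < self.size - 1]

    def step(self, action):
        if not self.get_valid_actions()[action]:
            raise ValueError(f"invalid action {action} at position {self.pos}")
        self.pos += 1 if action == 1 else -1
        done = self.pos == self.size - 1
        return self.pos, (1.0 if done else 0.0), done

env = MaskedGridEnv()
mask = env.get_valid_actions()          # at pos 0: [False, True]
valid = [a for a, ok in enumerate(mask) if ok]
action = random.choice(valid)           # sample only among valid actions
obs, reward, done = env.step(action)
```

In practice the mask is often returned in the `info` dict of `step`/`reset`, or via a wrapper, and masked policies set the logits of invalid actions to negative infinity before sampling.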

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange