Why don't the gym environments come with "valid actions"? The standard Gym environment accepts any action as input, even one that isn't currently possible.

Is this normal in reinforcement learning? Do models really have to learn which actions are valid all the time? Wouldn't it be much nicer to have an env.get_valid_actions() function so the model knows which actions are doable? Or is this somehow possible and I'm missing it?
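Gym's core API indeed has no such method, but nothing stops an environment author from exposing one. Below is a minimal sketch of the pattern, sometimes called "action masking": the environment reports a boolean mask over its discrete action space, and the agent samples only among valid actions. `MaskedEnv`, `get_valid_actions`, and `masked_random_action` are hypothetical names for illustration, not part of Gym's API; the `"action_mask"` entry in the info dict mirrors what some Gymnasium environments (e.g. Taxi-v3) do.

```python
import numpy as np


class MaskedEnv:
    """Toy 1-D grid-walk environment (hypothetical) that exposes an action mask.

    Actions: 0 = step left, 1 = step right. Stepping off either end is invalid.
    """

    N_ACTIONS = 2

    def __init__(self, size=5):
        self.size = size
        self.pos = 0

    def get_valid_actions(self):
        # Boolean mask over the discrete action space: True = doable right now.
        mask = np.ones(self.N_ACTIONS, dtype=bool)
        if self.pos == 0:
            mask[0] = False               # can't step left off the grid
        if self.pos == self.size - 1:
            mask[1] = False               # can't step right off the grid
        return mask

    def step(self, action):
        if not self.get_valid_actions()[action]:
            raise ValueError(f"invalid action {action} at pos {self.pos}")
        self.pos += 1 if action == 1 else -1
        done = self.pos == self.size - 1
        # Expose the mask in the info dict so the agent can use it next step.
        return self.pos, float(done), done, {"action_mask": self.get_valid_actions()}


def masked_random_action(mask, rng=None):
    # Sample uniformly among the currently valid actions only.
    rng = rng or np.random.default_rng(0)
    return int(rng.choice(np.flatnonzero(mask)))
```

In deep RL the same mask is typically applied to the policy's logits (setting invalid entries to minus infinity before the softmax) rather than used for rejection, so gradients never flow toward impossible actions.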

No correct solution

Licensed under: CC-BY-SA with attribution