Question

Why don't the gym environments come with "valid actions"? A standard gym environment accepts any action as input, even one that isn't possible in the current state.

Is this normal in reinforcement learning? Do models really have to learn which actions are valid all the time? Wouldn't it be much nicer to have an env.get_valid_actions() function so that the model knows which actions are doable? Or is this somehow possible and I'm missing it?
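For context, the kind of API the question is asking about can be added to a custom environment yourself. The sketch below is purely illustrative, assuming a toy grid-world: the `GridEnv` class and its `get_valid_actions` method are hypothetical, not part of the standard gym API.

```python
import random

class GridEnv:
    """Toy 1-D grid environment that exposes which actions are legal.

    Actions: 0 = move left, 1 = move right. Moving off either end of
    the grid is invalid, so the valid set depends on the current state.
    (get_valid_actions is a hypothetical method, not gym API.)
    """

    def __init__(self, size=5):
        self.size = size
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def get_valid_actions(self):
        valid = []
        if self.pos > 0:
            valid.append(0)          # can move left
        if self.pos < self.size - 1:
            valid.append(1)          # can move right
        return valid

    def step(self, action):
        # Reject actions that are impossible in the current state.
        if action not in self.get_valid_actions():
            raise ValueError(f"invalid action {action} in state {self.pos}")
        self.pos += 1 if action == 1 else -1
        done = self.pos == self.size - 1
        reward = 1.0 if done else 0.0
        return self.pos, reward, done

env = GridEnv()
state = env.reset()
# The agent samples only among the currently valid actions,
# instead of trying every action and learning which ones fail.
while True:
    action = random.choice(env.get_valid_actions())
    state, reward, done = env.step(action)
    if done:
        break
```

In practice this idea is usually called "action masking": the environment returns a mask over the action space, and the policy zeroes out the probabilities of invalid actions before sampling.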

No accepted solution

Licensed under: CC-BY-SA with attribution