Question

I found this resource that explains Q-learning with a very simple example. Making it a 2D problem, a rectangle instead of a line, keeps it simple: the only difference is that there are now two more possible actions (up and down).
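For concreteness, here is a minimal sketch of the tabular version on that rectangle. The grid size, start cell, treasure location, rewards, and hyperparameters are all my own assumptions, not taken from the linked resource:

```python
import random

WIDTH, HEIGHT = 5, 4                      # fixed rectangle (assumption)
TREASURE = (WIDTH - 1, HEIGHT - 1)        # goal cell (assumption)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # left, right, up, down

ALPHA, GAMMA, EPSILON, EPISODES = 0.1, 0.9, 0.1, 500

# Q-table keyed on absolute (x, y) cells, one value per action.
Q = {(x, y): [0.0] * 4 for x in range(WIDTH) for y in range(HEIGHT)}

def step(state, a):
    dx, dy = ACTIONS[a]
    nx = min(max(state[0] + dx, 0), WIDTH - 1)   # clamp to the rectangle
    ny = min(max(state[1] + dy, 0), HEIGHT - 1)
    nxt = (nx, ny)
    return nxt, (1.0 if nxt == TREASURE else 0.0), nxt == TREASURE

for _ in range(EPISODES):
    s = (0, 0)                            # fixed start cell (assumption)
    for _ in range(1000):                 # step cap keeps early episodes bounded
        if random.random() < EPSILON:     # epsilon-greedy exploration
            a = random.randrange(4)
        else:
            a = max(range(4), key=lambda i: Q[s][i])
        s2, r, done = step(s, a)
        # standard tabular Q-learning update
        Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])
        s = s2
        if done:
            break
```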

My question is: if the length and height of the rectangle are random, as well as the starting position and the location of the treasure, how can the bot apply the knowledge it acquired to the new problem? Is there an evolved version of Q-learning for problems with dynamic state spaces?
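To make the difficulty concrete: the table above is keyed on absolute (x, y) cells, so nothing learned on one rectangle transfers to a rectangle of a different size. One way to get transfer, sketched below purely as my own illustration (not from the linked resource), is to key the table on state features that stay meaningful across layouts, such as the relative offset to the treasure. The fully general version of this idea is Q-learning with function approximation, e.g. Deep Q-Networks, where a network maps state features to Q-values:

```python
import random

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # left, right, up, down
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

# One shared table, keyed on the (dx, dy) offset to the treasure rather than
# on absolute cells, so what is learned on one rectangle applies to another.
Q = {}

def q_values(offset):
    return Q.setdefault(offset, [0.0] * 4)

def run_episode(width, height, start, treasure):
    x, y = start
    for _ in range(4 * width * height):        # step cap bounds episode length
        if (x, y) == treasure:
            break
        off = (treasure[0] - x, treasure[1] - y)
        if random.random() < EPSILON:          # epsilon-greedy exploration
            a = random.randrange(4)
        else:
            a = max(range(4), key=lambda i: q_values(off)[i])
        dx, dy = ACTIONS[a]
        nx = min(max(x + dx, 0), width - 1)    # clamp to this rectangle
        ny = min(max(y + dy, 0), height - 1)
        r = 1.0 if (nx, ny) == treasure else 0.0
        nxt = (treasure[0] - nx, treasure[1] - ny)
        target = r if (nx, ny) == treasure else r + GAMMA * max(q_values(nxt))
        q_values(off)[a] += ALPHA * (target - q_values(off)[a])
        x, y = nx, ny

# Every episode randomizes the rectangle, the start, and the treasure,
# yet all of them update the same offset-keyed table.
for _ in range(300):
    w, h = random.randint(3, 10), random.randint(3, 10)
    run_episode(w, h,
                (random.randrange(w), random.randrange(h)),
                (random.randrange(w), random.randrange(h)))
```

Note that the offset feature ignores the walls, so this abstraction is only approximate near the edges; function approximation handles such cases more gracefully.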

No correct solution

Licensed under: CC-BY-SA with attribution