Question

I am researching the use of machine learning paradigms for pathfinding problems. I am currently looking into reinforcement learning and have used Q-learning for pathfinding.

When there are few states, Q-learning works well, but as the environment and the number of states grow, it performs rather badly. Since Q-learning converges so slowly, I am wondering whether it is possible to interpolate the Q-values of unexplored states, given that Q-learning does not use a model. Is this possible in reinforcement learning in general, or does it require learning every possible state?
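
To make the setting concrete, here is a minimal sketch of tabular Q-learning on a grid, not my actual code. The function name `q_learning_grid`, the square grid with the goal in the bottom-right corner, the reward of 1 on reaching the goal, and all hyperparameter values are assumptions made for illustration. It shows why the table scales poorly: every (state, action) pair has its own entry, so a state that is never visited keeps its initial value and gains nothing from its neighbours.

```python
import random


def q_learning_grid(size=4, episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a size x size grid (hypothetical toy environment).

    The agent starts at (0, 0); the goal is the bottom-right cell.
    Reward is 1.0 on entering the goal and 0.0 otherwise (an assumption).
    """
    rng = random.Random(seed)
    moves = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up
    goal = (size - 1, size - 1)

    # One independent entry per (state, action) pair: nothing is shared
    # between states, so the table has size * size * 4 entries.
    q = {((r, c), a): 0.0
         for r in range(size) for c in range(size) for a in range(len(moves))}

    for _ in range(episodes):
        state = (0, 0)
        for _ in range(100 * size):  # step cap so every episode terminates
            if state == goal:
                break
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                action = rng.randrange(len(moves))
            else:
                best = max(q[(state, a)] for a in range(len(moves)))
                action = rng.choice(
                    [a for a in range(len(moves)) if q[(state, a)] == best])

            # Moves that would leave the grid keep the agent on the border.
            dr, dc = moves[action]
            nxt = (min(max(state[0] + dr, 0), size - 1),
                   min(max(state[1] + dc, 0), size - 1))
            reward = 1.0 if nxt == goal else 0.0

            # Standard Q-learning update; the goal is terminal, so it
            # contributes no bootstrapped future value.
            future = 0.0 if nxt == goal else max(
                q[(nxt, a)] for a in range(len(moves)))
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            state = nxt
    return q
```

Calling `q_learning_grid(size=4)` returns a table with 64 entries, and the entry count grows with the square of the grid size. This growth is what makes me ask whether values for unexplored states can be interpolated instead of learned one by one.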

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange