Pergunta

I am trying to understand how pure Monte Carlo tree search (https://en.wikipedia.org/wiki/Monte_Carlo_tree_search) handles games where random playout will likely result in a loss easily avoided by other types of agents.

For example, imagine a hypothetical game with a single player. At game start, the player can choose between move A or move B.

Move A results in a fair coin flip that determines win/loss.

Move B results in a second set of 10 possible moves for player A, one of which is a win (move C, let's call it), and the other 9 a loss.

Random playout will have move A winning 50% of the time, and Move B winning 10% of the time, even though the ability to always pick the winning move B->C is within the player's power.

Can MCTS handle games of this nature well? How does it do so?

Obviously this toy example can trivially be solved by other search methods, but the real game I'm interested in has a prohibitive branching factor for things like depth- or breadth-first search, and a proper evaluation function is difficult.

Nenhuma solução correta

Licenciado em: CC-BY-SA com atribuição
Não afiliado a cs.stackexchange
scroll top