Monte Carlo with UCB applied to a complex card game
03-07-2021
Question
I'm trying to understand how the MCTS Algorithm works and how I would implement it in a card game to improve the AI engine.
I've read the mcts.ai website and many papers about it, including one that reports results from applying Monte Carlo search with UCB to the AI of a Magic card game, which is more or less what I need to do. However, I'm having trouble understanding some points and how to apply the algorithm to solve my problem. I'm also not very experienced in maths, so I get lost when the papers explain everything with complicated formulas.
This is what I've come up with so far:
Given a game state (the user's hand), determine all the legal plays that can be made, then create a list of nodes (one per play) as children of the MCTSTree's root node, each holding its result (a score value?).
Simulate a complete game (played out until the end) for each of those legal plays with a random player, and record in every node whether the player won or lost, in order to have a full picture.
Here is where "I think" Monte Carlo + UCB should be applied:
Select the most promising play (node) using UCB recursively, and if it is a leaf, expand that node with all possible plays from its gameState.
Simulate n playouts from the selected node until a certain amount of time is reached.
- At this stage I have some doubts: say I try a random playout from a list of possible playouts; what do I do with that first result in order to continue simulating? Should I grow the tree then?
How do I backpropagate the results?
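For concreteness, the UCB formula mentioned above (UCB1 in most papers) can be sketched in a few lines of Python. The names here are illustrative, not from any particular paper; `c` is the exploration constant, commonly √2:

```python
import math

def ucb1(wins, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 value of a child node: average reward plus an
    exploration bonus that shrinks as the node is visited more."""
    if visits == 0:
        return float("inf")  # unvisited children are always tried first
    return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)

# During selection, the child with the highest UCB1 value is descended into,
# e.g. for a child with 6 wins in 10 visits under a parent visited 50 times:
score = ucb1(6, 10, 50)
```

The first term rewards plays that have won often so far; the second term rewards plays that have been tried rarely, which is what balances exploitation against exploration.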
Then,
Bearing in mind that this is a complex card game with many possible moves, would performance be good enough with so many children per node?
If every simulation is based on a game state, and the state changes every time a player makes a move, how can I know whether the tree is really useful?
I would really appreciate any help on this.
Thank you very much!
Solution
MCTS is just the following:
I describe it a bit differently from what the image suggests, in a form that may be closer to an implementation.
- Descend from your root node (the current state of the game), using UCB at each step, until you decide upon an uninstantiated node l. (Select)
- Add l to your tree. (Expand)
- From l, play a random game. (Simulate)
- Update all the nodes on the path from l back to the root node with the result of the playout. (Backpropagate)
- Repeat until time runs out.
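Those four steps can be sketched as a single loop. This is a minimal single-perspective sketch in Python, not a drop-in implementation: the game-state interface (`legal_moves()`, `apply()`, `is_terminal()`, `winner()`) is an assumption you would replace with your own game code.

```python
import math
import random

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children = []  # instantiated successor nodes
        # moves not yet expanded into children (none for terminal states)
        self.untried = [] if state.is_terminal() else state.legal_moves()
        self.wins = 0.0
        self.visits = 0

    def ucb1(self, c=math.sqrt(2)):
        return self.wins / self.visits + c * math.sqrt(
            math.log(self.parent.visits) / self.visits)

def mcts(root_state, iterations):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Select: descend while the node is fully expanded and non-terminal
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ch.ucb1())
        # 2. Expand: instantiate one untried move as a new child
        if node.untried:
            move = node.untried.pop()
            child = Node(node.state.apply(move), parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulate: play randomly until the end of the game
        state = node.state
        while not state.is_terminal():
            state = state.apply(random.choice(state.legal_moves()))
        result = state.winner()  # e.g. 1.0 for a win, 0.0 for a loss
        # 4. Backpropagate: update every node on the path back to the root
        while node is not None:
            node.visits += 1
            node.wins += result
            node = node.parent
    # finally, play the most-visited move from the root
    return max(root.children, key=lambda ch: ch.visits).move
```

Note that in a two-player game the playout result would need to be scored from the point of view of the player to move at each node during backpropagation; the sketch keeps a single perspective for brevity.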
If your branching factor is large, as you mentioned, you might need to consider other strategies for selecting a successor while descending the tree, like RAVE.