A good Minimax representation in Gomoku?

https://stackoverflow.com/questions/5469914

14-11-2019

Question

I am trying to code a Gomoku (five in a row) game in Java as an individual project. For the AI, I understand that the use of a Minimax function with Alpha-beta Pruning is a good way to approach this. However, I'm having a little trouble envisioning how this would work.

My question is this: what is a good representation for a node in a minimax tree?

I'm thinking my evaluation function will "weight" all the empty spaces on the board. It will then take the best value from that board as the node of the minimax decision tree. Am I on the right track?

And any other tips are also welcome! Thanks in advance!

Solution

The state space search runs over the different states of the board. There are a lot of moves, since you can place a stone on any unoccupied point. Each state can be represented as, e.g., a 9x9 matrix with 3 possible values per cell: white, black, or unoccupied. With a 9x9 board, there are therefore at most 3 ^ 81 possible board states (an upper bound, since many of those configurations can never arise in a real game).

From any board state, the number of moves is the number of unoccupied vertices. You can place a stone on any of these vertices, and you can only play your own color. So there are at most 81 possible moves: 81 for the first move, 80 for the second, and so on. Five moves from an empty board is 81 x 80 x 79 x 78 x 77, roughly 3.1 billion move sequences before any pruning. With alpha-beta pruning you can reasonably search to depth 5, and possibly deeper, which is not too bad.

The proper representation is, as mentioned, a 2D matrix. This can just be a 2D array of ints, with values e.g. 0 for unoccupied, 1 for white, 2 for black. In Java, that is new int[9][9].
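A minimal sketch of that representation, together with the move generator described above (every unoccupied cell is a legal move). The class name GomokuBoard and its method names are illustrative assumptions, not part of any library:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical board class: 0 = unoccupied, 1 = white, 2 = black. */
final class GomokuBoard {
    static final int EMPTY = 0, WHITE = 1, BLACK = 2;

    final int size;
    final int[][] cells;

    GomokuBoard(int size) {
        this.size = size;
        // Java zero-initialises int arrays, so every cell starts EMPTY.
        this.cells = new int[size][size];
    }

    /** Every unoccupied cell is a legal move, returned as {row, col} pairs. */
    List<int[]> legalMoves() {
        List<int[]> moves = new ArrayList<>();
        for (int r = 0; r < size; r++) {
            for (int c = 0; c < size; c++) {
                if (cells[r][c] == EMPTY) moves.add(new int[] {r, c});
            }
        }
        return moves;
    }

    /** Places a stone for the given player on an empty cell. */
    void place(int row, int col, int player) {
        if (cells[row][col] != EMPTY) throw new IllegalArgumentException("cell occupied");
        cells[row][col] = player;
    }

    /** Takes a stone back, which lets a search reuse one board instead of copying it. */
    void undo(int row, int col) {
        cells[row][col] = EMPTY;
    }
}
```

A search can call place, recurse, then undo, so a node in the tree is just "this board plus whose turn it is" and no separate copy is needed per node.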

Your evaluation function doesn't sound very good. Instead, I would give points for the following:

-- 5 in a row: give it the max score, since it's a win.
-- 4 in a row with 2 open ends: also max score, since the opponent can't block you from getting 5.
-- 4 in a row with 1 open end: still a very threatening position, since the opponent has to play in one specific spot to block.
-- 3 in a row with 2 open ends: very high score again.
-- 4, 3, 2, or 1 in a row with both ends closed: 0, since it can never become 5 in a row.

and so on.
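One way to turn that scoring list into code is to walk every unbroken run of a player's stones and count how many of its two ends are empty. This is only a sketch: the class name RunEvaluator and every weight in runScore are made-up placeholders, runs of five or more are treated as wins (an assumption about the rule set), and the open four is scored just below a real five so that an actual win is always preferred.

```java
/** Hypothetical run-based evaluator; all weights are illustrative placeholders. */
final class RunEvaluator {
    static final int EMPTY = 0;
    static final int WIN = 1_000_000;
    // The four line directions: right, down, down-right, down-left.
    private static final int[][] DIRS = {{0, 1}, {1, 0}, {1, 1}, {1, -1}};

    /** Points for one unbroken run of len stones with 0, 1 or 2 empty cells at its ends. */
    static int runScore(int len, int openEnds) {
        if (len >= 5) return WIN;                               // five in a row: a win
        if (openEnds == 0) return 0;                            // closed at both ends: dead
        if (len == 4) return openEnds == 2 ? WIN / 2 : 10_000;  // open four vs. blockable four
        if (len == 3) return openEnds == 2 ? 5_000 : 100;
        if (len == 2) return openEnds == 2 ? 50 : 10;
        return 1;
    }

    /** Sums runScore over every maximal run of the player's stones on the board. */
    static int scoreFor(int[][] board, int player) {
        int total = 0;
        for (int r = 0; r < board.length; r++) {
            for (int c = 0; c < board.length; c++) {
                if (board[r][c] != player) continue;
                for (int[] d : DIRS) {
                    int beforeR = r - d[0], beforeC = c - d[1];
                    // Count each run once, starting from its first stone.
                    if (cell(board, beforeR, beforeC) == player) continue;
                    int len = 0, rr = r, cc = c;
                    while (cell(board, rr, cc) == player) { len++; rr += d[0]; cc += d[1]; }
                    int open = 0;
                    if (cell(board, beforeR, beforeC) == EMPTY) open++;
                    if (cell(board, rr, cc) == EMPTY) open++;
                    total += runScore(len, open);
                }
            }
        }
        return total;
    }

    /** Positive values favour player, negative values favour opponent. */
    static int evaluate(int[][] board, int player, int opponent) {
        return scoreFor(board, player) - scoreFor(board, opponent);
    }

    /** Cell contents, or -1 for coordinates off the board (which count as closed). */
    private static int cell(int[][] board, int r, int c) {
        int n = board.length;
        return (r < 0 || r >= n || c < 0 || c >= n) ? -1 : board[r][c];
    }
}
```

This only sees solid runs. Broken shapes such as two stones, a gap, then two more are not recognised, so a real evaluator would need extra patterns.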

Then you just apply the standard minimax algorithm with alpha-beta pruning. It would be exactly the same as in chess, but with a different state space generator and evaluation function.
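The search itself can be written once against a small interface, which is why it carries over from chess unchanged. In this sketch the GameNode interface, the AlphaBeta class, and the TreeNode test fixture are all hypothetical names introduced for illustration. A Gomoku node would implement children() by placing a stone on each unoccupied cell and evaluate() with the scoring function.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical abstraction: a state that can list its successors and score itself. */
interface GameNode {
    boolean isTerminal();
    int evaluate();             // static score from the maximising player's point of view
    List<GameNode> children();  // the "state space generator"
}

final class AlphaBeta {
    /** Depth-limited minimax with alpha-beta pruning. */
    static int search(GameNode node, int depth, int alpha, int beta, boolean maximising) {
        if (depth == 0 || node.isTerminal()) return node.evaluate();
        if (maximising) {
            int best = Integer.MIN_VALUE;
            for (GameNode child : node.children()) {
                best = Math.max(best, search(child, depth - 1, alpha, beta, false));
                alpha = Math.max(alpha, best);
                if (alpha >= beta) break; // the minimiser would never allow this line
            }
            return best;
        } else {
            int best = Integer.MAX_VALUE;
            for (GameNode child : node.children()) {
                best = Math.min(best, search(child, depth - 1, alpha, beta, true));
                beta = Math.min(beta, best);
                if (alpha >= beta) break; // the maximiser already has something better
            }
            return best;
        }
    }
}

/** Tiny hand-built tree, used only to exercise the search. */
final class TreeNode implements GameNode {
    private final int value;
    private final List<GameNode> kids;

    TreeNode(int value) { this.value = value; this.kids = Arrays.asList(); }
    private TreeNode(List<GameNode> kids) { this.value = 0; this.kids = kids; }
    static TreeNode of(GameNode... kids) { return new TreeNode(Arrays.asList(kids)); }

    public boolean isTerminal() { return kids.isEmpty(); }
    public int evaluate() { return value; }
    public List<GameNode> children() { return kids; }
}
```

Pruning works best when strong moves are tried first, so in Gomoku it is common to order candidate moves (for example, cells next to existing stones before distant ones) before recursing.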

Other tips

I'd consider an evaluation function of the following form: consider each set of, say, 6 consecutive positions in a line. (On a 19x19 board there are 14 along each row and each column, and varying numbers from 0 to 14 along each diagonal. That comes to 19 x 14 = 266 for the rows, 266 for the columns, and 14 x 14 = 196 for each of the two diagonal directions, so 924 on the whole board.) For each set there are 3^6 = 729 possible arrangements of black, white and empty spaces. Or 378 if you take end-to-end symmetry into account. Or fewer still, 196 by my count, if you take black/white symmetry into account too (a color-swapped pattern should get the negated score).
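Counts like these are easy to get wrong by hand, so here is a small brute-force check. It is only a sketch with made-up names (WindowCounts and its methods). By its count, a 19x19 board has 924 windows of length 6, and the 729 patterns collapse to 378 classes under reversal and 196 when color swaps are identified as well.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper that brute-forces the window and pattern counts. */
final class WindowCounts {
    /** Number of length-len windows on an n x n board, over all four line directions. */
    static int windows(int n, int len) {
        int[][] dirs = {{0, 1}, {1, 0}, {1, 1}, {1, -1}};
        int count = 0;
        for (int r = 0; r < n; r++) {
            for (int c = 0; c < n; c++) {
                for (int[] d : dirs) {
                    // A window is valid if its last cell is still on the board.
                    int endR = r + (len - 1) * d[0], endC = c + (len - 1) * d[1];
                    if (endR >= 0 && endR < n && endC >= 0 && endC < n) count++;
                }
            }
        }
        return count;
    }

    /** Distinct patterns over {0 = empty, 1, 2}, optionally merging color-swapped ones. */
    static int patternClasses(int len, boolean mergeColorSwap) {
        int total = 1;
        for (int i = 0; i < len; i++) total *= 3;
        Set<Integer> canonical = new HashSet<>();
        for (int code = 0; code < total; code++) {
            // Represent each class by the smallest code among its symmetric variants.
            int best = Math.min(code, reverse(code, len));
            if (mergeColorSwap) {
                int swapped = swapColors(code, len);
                best = Math.min(best, Math.min(swapped, reverse(swapped, len)));
            }
            canonical.add(best);
        }
        return canonical.size();
    }

    /** Reverses the base-3 digits of code. */
    static int reverse(int code, int len) {
        int out = 0;
        for (int i = 0; i < len; i++) { out = out * 3 + code % 3; code /= 3; }
        return out;
    }

    /** Exchanges digits 1 and 2 (the two stone colors), leaving 0 (empty) alone. */
    static int swapColors(int code, int len) {
        int out = 0, place = 1;
        for (int i = 0; i < len; i++) {
            int digit = code % 3;
            out += (digit == 0 ? 0 : 3 - digit) * place;
            place *= 3;
            code /= 3;
        }
        return out;
    }
}
```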

So now your evaluation function will consist of a table lookup for each block of 6 positions, in a 378-or-however-many-element table (or perhaps two of them, one for horizontal and vertical lines, one for diagonal ones). Add up the results and that's your evaluation of the position.
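A minimal sketch of such a lookup-based evaluator. To keep it short it makes two simplifying assumptions: it uses one full 729-entry table with no symmetry folding, and it encodes cells as 0 = empty, 1 and 2 for the two colors. The class name PatternTableEval is made up for illustration.

```java
/** Hypothetical table-lookup evaluator over all 6-cell windows of the board. */
final class PatternTableEval {
    static final int LEN = 6;
    private static final int[][] DIRS = {{0, 1}, {1, 0}, {1, 1}, {1, -1}};

    /** One score per pattern, indexed by the window read as a base-3 number (3^6 = 729). */
    final double[] table = new double[729];

    /** Base-3 code of the window starting at (r, c) in direction d, or -1 if it leaves the board. */
    static int encode(int[][] board, int r, int c, int[] d) {
        int n = board.length, code = 0;
        for (int i = 0; i < LEN; i++) {
            int rr = r + i * d[0], cc = c + i * d[1];
            if (rr < 0 || rr >= n || cc < 0 || cc >= n) return -1;
            code = code * 3 + board[rr][cc];
        }
        return code;
    }

    /** Adds up the table entry of every window on the board. */
    double evaluate(int[][] board) {
        double sum = 0;
        for (int r = 0; r < board.length; r++) {
            for (int c = 0; c < board.length; c++) {
                for (int[] d : DIRS) {
                    int code = encode(board, r, c, d);
                    if (code >= 0) sum += table[code];
                }
            }
        }
        return sum;
    }
}
```

Rescanning the whole board at every node is the simplest version. Since a move only changes the windows that pass through one cell, an incremental update of the sum would be the natural optimisation.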

It may turn out that actually a larger table (derived from a longer row of positions) works better.

But what goes in the table? Let your program work that out. Start with arbitrary values in the table (you might, e.g., take eval(line) = #black(line)-#white(line) or something). Let your program play itself, using alpha-beta search. Now update the table entries according to what happens. There are many different ways of doing this; here are a (sketchily-described) few.

-- During each game, keep track of how many times each pattern occurred in each player's positions. When the game's over, adjust each pattern's score so that patterns seen more often by the winning player look better.
-- Each time you do a search, adjust the scores for the patterns in the current position to bring the current static score nearer to the score obtained by search.
-- Each time a move is made, adjust the scores for each pattern in the "before" position to make the "before" score match the "after" score better.
-- Have lots of different tables (hence lots of different variants of the evaluation function). Let them play against one another. Apply some sort of evolution (e.g., play all against all, then throw out the worst performers and replace them with mutants derived from the better performers).
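The second and third ideas share one primitive: nudge the table entries used by a position so that its static score moves toward some target (the search result, or the score of the "after" position). Here is a minimal sketch of that primitive. The update rule is a normalised least-mean-squares step chosen for illustration; it is not taken from the answer or from the linked paper, and the class name PatternTableTrainer is hypothetical.

```java
/** Hypothetical trainer for a pattern table, where a position is the list of its window codes. */
final class PatternTableTrainer {
    final double[] table;
    final double learningRate; // fraction of the error corrected per update, between 0 and 1

    PatternTableTrainer(int tableSize, double learningRate) {
        this.table = new double[tableSize];
        this.learningRate = learningRate;
    }

    /** Static score: the sum of table entries over every window code in the position. */
    double score(int[] patternCodes) {
        double sum = 0;
        for (int code : patternCodes) sum += table[code];
        return sum;
    }

    /**
     * Moves score(patternCodes) toward target by learningRate times the current error.
     * Each entry changes in proportion to how often its pattern occurs in the position.
     */
    void nudgeToward(int[] patternCodes, double target) {
        int[] counts = new int[table.length];
        for (int code : patternCodes) counts[code]++;

        long sumSquares = 0;
        for (int n : counts) sumSquares += (long) n * n;
        if (sumSquares == 0) return; // no windows, nothing to adjust

        // Dividing by the sum of squared counts makes the score move by exactly
        // learningRate * error, however many times a pattern repeats.
        double step = learningRate * (target - score(patternCodes)) / sumSquares;
        for (int code = 0; code < table.length; code++) {
            table[code] += step * counts[code];
        }
    }
}
```

For the "before/after" rule, the target would be the score of the position after the move. For the search rule, it would be the value alpha-beta returns for the current position.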

For a more sophisticated version of these ideas (applied to chess, but the same ideas would work fine for gomoku), take a look at http://cs.anu.edu.au/~Lex.Weaver/pub_sem/publications/knightcap.pdf .

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow