Question

I'm new in neural networks and i'm designing a feed forward neural network to learn to play the game checkers. As input, the board has to be given and the output should give a chance to win and lose. But how can be the ideal transformation of the checkers board to a row of numbers for input? There are 32 possible squares and 5 different possibility (king or piece of white or black player and free position) on each square. If i provide a input unit for each possible value for each square, it will be 32 * 5. Another option is that:

  Free Position: 0 0

  Piece of white: 0 0.5 && King Piece of white: 0 1

  Piece of black: 0.5 1 && King Piece of black: 1 0

In this case, input length will be just 64 but i'm not sure which one will give a better result?

Était-ce utile?

La solution 3

I've tried all possibilities and intuitive i can say that the most great idea is separating all possibilities for all squares. Thus, concrete:

0 0 0: free
1 0 0: white piece
0 0 1: black piece
1 1 0: white king
0 1 1: black king

It is also possible to enhance other parameters about the situation of the game like the amount of pieces under threat or amount of possibilities to jump.

Autres conseils

In case anyone is still interested in this topic—I suggest encoding the Checkers board with a 32 dimensional vector. I recently trained a CNN on an expert Checkers database and was able to acheive a suprisingly high level of play with no search, somewhat similar (I suspect) to the supervised learning step that Deepmind used to pretrain AlphaGo. I represented my input as an 8x4 grid, with entries in the set [-3, -1, 0, 1, 3] corresponding to an opposing king, opposing checker, empty, own checker, own king, repsectively. Thus, rather than encoding the board with a 160 dimensional vector where each dimension corresponds to a location-piece combination, the input space can be reduced to a 32-dimensional vector where each board location is represented by a unique dimension, and the piece at that location is encoded by a set of real numbers—this is done without any loss of information.

The more interesting question, at least in my mind, is which output encoding is most conducive for learning. One option is to encode it in the same way as the input. I would advise against this having found that simplifying the output encoding to a location (of the piece to move) and a direction (along which to move said piece) is much more advantageous for learning. While the reasons for this are likely more subtle, I suspect it is due to the enormous state space of checkers (something like 50^20 board possitions). Considering that the goal of our predictive model is to accept an input containing an enourmous number of possible states, and produce one ouput (i.e., move) from (at-most) 48 possibilities (12 pieces times 4 possible directions excluding jumps), a top priority in architecting a neural network should be matching the complexity of its input and output space to that of the actual game. With this in mind, I chose to encode the ouput as a 32 x 4 matrix, with each row representing a board location, and each column representing a direction. During training I simply unraveled this into a 128 dimensional, one-hot encoded vector (using argmax of softmax activations). Note that this output encoding lends itself to many invalid moves for a given board (e.g., moves off the board from edges and corners, moves to occupied locations, etc..)—we hope that the neural network can learn valid play given a large enough training set. I found that the CNN did a remarkable job at learning valid moves.

I’ve written more about this project at http://www.chrislarson.io/checkers-p1.

I've done this sort of thing with Tic-Tac-Toe. There are several ways to represent this. One of the most common for TTT is have input and output that represent the entire size of the board. In TTT this becomes 9 x hidden x 9. Input of -1 for X, 0 for none, 1 for O. Then the input to the neural network is the current state of the board. The output is the desired move. Whatever output neuron has the highest activation is going to be the move.

Propagation training will not work too well here because you will not have a finite training set. Something like Simulated Annealing, PSO, or anything with a score function would be ideal. Pitting the networks against each other for the scoring function would be great.

This worked somewhat well for TTT. I am not sure how it would work for Checkers. Chess would likely destroy it. For Go it would likely be useless.

The problem is that the neural network will learn patters only at fixed location. For example jumping an opponent in the top-left corner would be a totally different situation than jumping someone in the bottom left corner. These would have to be learned separately.

Perhaps better is to represent the exact state of the board in position independent way. This would require some thought. For instance you might communicate what "jump" opportunities exist. What move-towards king square opportunity's exist, etc and allow the net to learn to prioritize these.

Please see this thesis Blondie24 page 46, there is description of input for neural network.

Licencié sous: CC-BY-SA avec attribution
Non affilié à StackOverflow
scroll top