Determine if two chess positions are equal

https://stackoverflow.com/questions/3888918

28-09-2019
|

Question

I'm currently debugging my transposition table for a chess variant engine where pieces can be placed (ie. originally not on the board). I need to know how often I'm hitting key collisions. I'm saving the piece list in each table index, along with the usual hash data. My simple solution for determining if two positions are equal is failing on transpositions because I'm linearly comparing the two piece lists.

Please do not suggest that I should be storing by board-centric instead of piece-centric. I have to store the piece list because of the unique nature of placable and captured pieces. Pieces in those states are like they are occupying an overlapping and position-less location. Please look at the description of how pieces are stored.

// [Piece List]
// 
// Contents: The location of the pieces.
//           Values 0-63 are board indexes; -2 is dead; -1 is placeable
// Structure: Black pieces are at indexes 0-15
//            White pieces are at indexes 16-31
//            Within each set of colors the pieces are arranged as following:
//            8 Pawns, 2 Knights, 2 Bishops, 2 Rooks, 1 Queen, 1 King
// Example: piece[15] = 6 means the black king is on board index 6
//          piece[29] = -2 means the white rook is dead
char piece[32];

A transposition happens when pieces are moved in a different order, but the end result is the same board position. For example the following positions are equal:

1) first rook on A1; second rook on D7
2) first rook on D7; second rook on A1

The following is a non-optimised general algorithm; and the inner loop is similar to another general problem, but with the added restraint that values in 0-63 will only happen once (ie. only one piece per square).

for each color:
    for each piece type:
        are all pieces in the same position, disregarding transpositions?

The following comparison does NOT work because of transpositions. What I need is a way to detect transpositions as equal and only report actually different positions.

bool operator==(const Position &b)
{
    for (int i = 0; i < 32; i++)
        if (piece[i] != b.piece[i])
            return false;
    return true;
}

Performance/memory is a consideration because the table gets over 100K hits (where keys are equal) per turn and a typical table has 1 million items. Henceforth, I'm looking for something faster than copying and sorting the lists.

Solution

"do not suggest that I should be storing by board-centric instead of piece-centric".

You're so focused on not doing that, that you miss the obvious solution. Compare board-specific. To compare two position lists L1 and L2, place all elements of L1 on a (temporary) board. Then, for each element of L2, check if it's present on the temporary board. If an element of L2 is not present on the board (and thus in L1), return unequal.

If after removing all elements of L2, there are still pieces left on the board, then L1 must have had elements not present in L2 and the lists are equal. L1 and L2 are only equal when the temporary board is empty afterwards.

An optimization is to check the lengths of L1 and L2 first. Not only will this catch many discrepancies quickly, it also eliminates the need to remove the elemetns of L2 from the baord and the "empty board" check at the end. That is only needed to catch the case where L1 is a true superset of L2. If L1 and L2 have the same size, and L2 is a subset of L1, then L1 and L2 must be equal.

OTHER TIPS

There is lot of research done on computer chess and the way to create unique hash for a position is a well know problem with an universal solution used by virtually every chess engine.

What you need to do is use Zobrist Hashing to create a unique (not really unique, but we'll see later why this is not a problem in practice) key for each different positions. The Algorithm applied to chess is explained here.

When you start your program you create what we call zobrist keys. These are 64 bits random integers for each piece/square pairs. In C you would have an 2 dimension array like this :

unsigned long long zobKeys[NUMBER_OF_PIECES][NUMBER_OF_SQUARES];

Each of this key are initialized with a good random number generator (Warning : the random number generator provided with gcc or VC++ are not good enough, use an implementation of the Mersenne Twister).

When the board is empty you arbitrarily set it's hash key to 0, then when you add a piece on the board, say a Rook on A1, you also update the hash key by XORing the zobrist key for a rook on A1 with the hash key of the board. Like this (in C) :

boardHash = boardHash ^ zobKeys[ROOK][A1];

If you later remove the rook from this square you need to reverse what you just did, since a XOR can be reversed by applaying it again, you can simply use the same command again when you remove the piece :

boardHash = boardHash ^ zobKeys[ROOK][A1];

If you move a piece, say the rook on A1 goest to B1, you need to do two XOR, one to remove the rook on A1 and one to add a rook on B2.

boardHash = boardHash ^ zobKeys[ROOK][A1] ^ boardHash ^ zobKeys[ROOK][B1];

This way everytime you modify the board you also modify the hash. It is very efficient. You could also compute the hash from scatch each time by xoring the zobKeys corresponding to all pieces on the board. You will also need to XOR the position of the pawn that can be taken en passant and the status of the rooking capabilities of both side. You do it the same way, by creating zobris keys for each possible values.

This algotitm does not guaranty that each position has a unique hash, however, if you use a good pseudo random number generator, the odds of a collision occuring are so low that even if you let your engine play you whole life there is virtually no chances of a collision ever occuring.

edit: I just red that you are trying to implement this for a variant of chess that has off-board pieces. Zobrist hashing is still the right solution for you. You will have to find a way to incorporate theses information in the hash. You could for example have some keys for the off-the-board pieces :

unsigned long long offTheBoardZobKeys[NUMBER_OF_PIECE][MAXIMUM_NUMBER_OF_ON_PIECE_TYPE];

If you have 2 paws off the board and put one of this pawn on a2, you will have to do 2 operations :

// remove one pawn from the off-the-board
boardHash = boardHash ^ offTheBoardZobKeys[WHITE_PAWN][numberOfWhitePawsOffTheBoard];

// Put a pawn on a2
boardHash = boardHash ^ zobKeys[WHITE_PAWN][A2];

Why not keep an 64 byte string in your database that corresponds to the chessboard layout? Every type of piece, including 'no piece' represents a letter (different caps for both colors, ie ABC for black, abc for white). Board comparison boils down to simple string comparison.

In general, comparing from the chessboard perspective, instead of the piece perspective, will get rid of your transpositions problem!

Your main objection to storing the states board-wise is that you have a bag of position-less pieces. Why not maintain a board + a vector of pieces? This would meet your requirements and it has the advantage that it is a canonical representation for your states. Hence you don't need the sorting, and you either use this representation internally or convert to it when you need to compare :

Piece-type in A1
... 63 more squares
Number of white pawns off-board
Number of black pawns off-board
... other piece types

From the piece perspective you could do this:

for each color:
    for each piece type:
        start new list for board A
        for each piece of this piece type on board A
            add piece position to the list
        start new list for board B
        for each piece of this piece type on board B
            add piece position to the list
        order both lists and compare them

Optimizations can come in different ways. Your advantage is: as soon as you notice a difference: your done!

You could for instance start with a quick and dirty check by summing up all the indexes for all pieces, for both boards. The sums should be equal. If not, there's a difference.

If the sums are equal, you could quickly compare the positions of the unique pieces (King and Queen). Then you could write out (in somewhat complicated if statements) the comparisons for the pieces that are in pairs. All you then have to do is compare the pawns using the above stated method.

And a third option (I really do hope posting 3 answers to one question is ok, stackoverflow-wise ;)):

Always keep your pieces of the same type in index-order, ie the first pawn in the list should always have the lowest index. If a move takes place that breaks this, just flip the pawns positions in the list. The use won't see the difference, a pawn is a pawn.

Now when comparing positions, you can be sure there's no transpositions problem and you can just use your proposed for-loop.

Given your choice of game state representation, you have to sort the black pawns' indices, the white pawns' indices, etc., one way or the other. If you don't do it in the course of creating a new game state, you will have to do it upon comparison. Because you only need to sort a maximum of 8 elements, this can be done quite fast.

There are a few alternatives to represent your game states:

Represent each type of piece as a bit field. The first 64 bits mean that there is a piece of this type on that board coordinate; then there are n bits of "placeable" and n bits of "dead" slots, which have to be filled from one side (n is the number of pieces of this type).

Give each type of piece a unique ID, e.g. white pawns could be 0x01. A game state consists of an array of 64 pieces (the board) and two ordered lists of "placeable" and "dead" pieces. Maintaining the order of these lists can be done quite efficiently upon inserting and deleting.

These two alternatives would not have a transposition problem.

Anyway, I have the impression that you are fiddling around with micro-optimizations when you should first get it to work.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow