Othello Alpha-beta podar jugando mal Python

https://stackoverflow.com/questions/8826230

27-10-2019
|

Pregunta

Actualmente estoy tratando de hacer una buena IA para Othello, y lo he hecho usando el algoritmo Minimax. Sin embargo, cuando intenté hacer una búsqueda más profunda usando la poda alfa-beta, parecía que el algoritmo estaba jugando terriblemente. Lo revisé con otras fuentes como Wiki y Berkely.edu, y creo que lo he implementado correctamente, pero aún no puedo encontrar el problema.

def alphabeta(board, player, a, b, lev):
        h = heur(board, player)
        if lev == 0:
                return h, None
        poss = get_legal_moves(board, player)
        if len(poss) == 0:
                return h, None
        move = 0
        for x in poss:
                cpboard = board[:]
                cpboard[x] = player
                bracket(cpboard, player, x)
                a1, q = alphabeta(cpboard, opponent_color(player), a, b, lev-1)
                if player is me:
                        if a1 > a:
                                a, move = a1, x
                else:
                        if a1 < b:
                                b, move = a1, x
                if b <= a:
                        break
        if player is me:
                return a, move
        else:
                return b, move

Solución

Su código alfa-beta probablemente sea incorrecto. Tenga en cuenta lo que sucede cuando un jugador 'pasar el turno' (es decir, no tiene movimientos disponibles), tuve un error complicado en mi código, debido a esto.

¿Llamaste a la recursión con los valores alfa y beta conmutados? El mío funciona así (código Java):

private float minimax(OthelloBoard board, OthelloMove best, float alpha, float beta, int depth)
{
    float bestResult = -Float.MAX_VALUE;
    OthelloMove garbage = new OthelloMove();

    int state = board.getState();
    int currentPlayer = board.getCurrentPlayer();

    if (state == OthelloBoard.STATE_DRAW)
        return 0.0f;
    if ((state == OthelloBoard.STATE_BLACK_WINS) && (currentPlayer == OthelloBoard.BLACK))                    
        return INFINITY;        
    if ((state == OthelloBoard.STATE_WHITE_WINS) && (currentPlayer == OthelloBoard.WHITE))
        return INFINITY;
    if ((state == OthelloBoard.STATE_BLACK_WINS) && (currentPlayer == OthelloBoard.WHITE))
        return -INFINITY;
    if ((state == OthelloBoard.STATE_WHITE_WINS) && (currentPlayer == OthelloBoard.BLACK))
        return -INFINITY;

    if (depth == maxDepth)
        return OthelloHeuristics.eval(currentPlayer, board);

    ArrayList<OthelloMove> moves = board.getAllMoves(currentPlayer);

    for (OthelloMove mv : moves)
    {            
        board.makeMove(mv);
        alpha = - minimax(board, garbage, -beta, -alpha, depth + 1);
        board.undoMove(mv);

        if (beta <= alpha)
            return alpha;
        if (alpha > bestResult)
        {                
            best.setFlipSquares(mv.getFlipSquares());
            best.setIdx(mv.getIdx());        
            best.setPlayer(mv.getPlayer());
            bestResult = alpha;
        }
    }

     return bestResult;
}

La llamada es como:

 OthelloMove bestFound = new OthelloMove();
 int maxDepth = 8;
 minimax(board, bestFound, -Float.MAX_VALUE, Float.MAX_VALUE, maxDepth);
//Wait for Thread to finish
 board.makeMove(bestFound);

Editar: en caso de que un jugador no tenga movimientos disponibles, getAllMoves () devuelva un 'movimiento ficticio', eso no cambia el tablero en absoluto, simplemente pase el giro.

¡Espero eso ayude!

Otros consejos

Su implementación de alfabeta me parece sólida. Dado que Minimax y Alphabeta producen los mismos resultados cuando se implementan correctamente, debería poder usar su código Minimax anterior como una verificación contra Alphabeta, al menos para profundidades de búsqueda modestas. Si sus resultados difieren al buscar el mismo árbol de juego, entonces sabes que has hecho algo mal.

Pero lo más probable es que el mal juego sea el resultado de algo en su función de evaluación "Heur".

Licenciado bajo: CC-BY-SA con atribución

No afiliado a StackOverflow