A scoring approach to computer opponents that needs balancing

https://cs.stackexchange.com/questions/21981

31-10-2019
|

题

This question is about an approach to computer opponents that I have created and are either currently being used, or are planned to be used, in several computer games.

Background

Last year, when trying to improve a computer opponent for a game called "Minesweeper Flags" (short description: A turn-based multiplayer version of Minesweeper where you have to take more mines than your opponent), I strongly changed the way my algorithms worked. Instead of using an approach like if-else-if-else, I am using a set of "scorers" with specified weights to determine what the best move is.

You might think that for a game like Minesweeper Flags, it's only about making moves that gives you the highest probability of taking a mine, but it's not that simple. Which move the computer will make usually depends on several features for that specific move in the current game state. Examples of features:

What's the probability of this move scoring a mine?
What's the probability of revealing anything to my opponent here?

Description of the system

The system basically works like this:

"Pre-scorers": Some pre-analysis is done for the current game state (in terms of Minesweeper Flags, this is usually: Calculating all the probabilities)
"Scorers": A set of ordinary scorers are asked to determine the score for each possible move, each scorer applies scores according to it's own criteria. The scorers can check the results of the pre-analysis that was made.
The scores calculated in the above step is summed together and is set to be the score for a move.
The moves are sorted according to their score and ranked so that all moves with the same score gets the same rank.
"Post-scorers": The result of the above can be sent to "Post-scorers" that have the possibility to modify the scores of any fields in any way they want, according to the post-scorer's own rules.

When combining a bunch of pre-scorers, scorers (with their weights) and post-scorers, it's becomes what I call a score configuration.

Example result

This is an example of scores having been applied to Minesweeper Flags. This is the map that was scored:

Minesweeper Flags map that was scored

And this is the output of an actual score configuration. It is showing the rank of the possible moves, where 1 is the best rank and has been highlighted in white:

Example output of scoring approach

Thanks to having written highly flexible code, this approach to AIs can be inserted into other games as well.

Advantages and disadvantages

Below are some advantages and disadvantages of this system that I can think of myself

Advantages

It's very easy to create a whole lot of different configurations for AIs.
It is possible to use with Genetic Algorithms: Each scorer has an associated weight, the weight can become the gene.
Using some tools, it is possible to check why a specific move was made and which scorers were mainly responsible for that move
Using tools, it is possible to create a map of the overall score/rank of possible moves (like the screenshot above)
By applying scores to the way the human plays, it is possible to create an "#AI_Mirror" which tries to make moves that it thinks the human would make

Disadvantages

It can be extremely difficult to adjust a score configuration "correctly", to make the AI play as good as possible.

Questions

Is the system I have built here widely known in the AI world? What would it be called in real AI terms?
Does this approach make sense or is there a different approach that you would recommend?
What ways are there that could make the process of tweaking a score configuration easier?

Regarding the last question, I am aware of the possibility of using genetic algorithms, I am also mildly aware of SARSA (and I do think my scorers resembles that site's description of features with weights, but from my understanding that's not exactly what I have created here). I think that a problem with SARSA is that you don't know the reward until the game is over, the best move is often a move which doesn't give a reward (a mine) at all. Your current chances of winning depends on both the current score (how many mines you and your opponent have taken) and what the current map looks like.

_{This question was originally posted on a now defunct Artificial Intelligence site.}
_{The (Java) code used for this approach has now been posted at Code Review.}

没有正确的解决方案

许可以下： CC-BY-SA 和归因

不隶属于 cs.stackexchange