Solve crossword puzzle with genetic algorithm, fitness, mutation

https://stackoverflow.com/questions/5859742

28-10-2019
|

Question

I'm trying hard to do a lab for school. I'm trying to solve a crossword puzzle using genetic algorithms. Problem is that is not very good (it is still too random) I will try to give a brief explanation of how my program is implemented now:

If i have the puzzle (# is block, 0 is empty space)

#000
00#0
#000

and a collection of words that are candidates to this puzzle's solution. My DNA is simply the matrix as a 1D array.

My first set of individuals have random generated DNAs from the pool of letters that my words contains.

I do selection using roulette-selection. There are some parameters about the chance of combination and mutations, but if mutation will happen then i will always change 25% of the DNA. I change it with random letters from my pool of letters.(this can have negative effects, as the mutations can destroy already formed words)

Now the fitness function: I traverse the matrix both horizontaly and verticaly: If i find a word then FITNESS += word.lengh +1

If i find a string that is a part of some word then FITNESS += word.length / (puzzle_size*4) . Anyway it should give a value between 0 and 1. So it can find "to" from "tool" and ads X to FITNESS, then right after it finds "too" from "tool" and adds another Y to FITNESS.

My generations are not actually improving over time. They appear random. So even after 400 generations with a pool of 1000-2000 (these numbers dont really matter) i get a solution with 1-2 words (of 2 or 3 letters) when the solution should have 6 words.

Solution

I think your fitness function might be ill-defined. I would set this up so each row has a binary fitness level. Either a row is fit, or it is not. (eg a Row is a word or it is not a word) Then the overall fitness of the solution would be #fit rows / total rows (both horizontally and vertically). Also, you might be changing too much of the dna, I would make that variable and experiment with that.

OTHER TIPS

Your fitness function looks OK to me, although without more detail it's hard to get a really good picture of what you're doing.

You don't specify the mutation probability, but when you do mutate, 25% is a very high mutation. Also, roulette wheel selection applies a lot of selection pressure. What you often see is that the algorithm pretty early on finds a solution that is quite a bit better than all the others, and roulette wheel selection causes the algorithm to select it with such high probability that you quickly end up with a population full of copies of that. At that point, search halts except for the occasional blindly lucky mutation, and since your mutations are so large, it's very unlikely that you'll find an improving move without wrecking the rest of the chromosome.

I'd try binary tournament selection, and a more sensible mutation operator. The usual heuristic people use for mutation is to (on average) flip one "bit" of each chromosome. You don't want a deterministic one letter change each time though. Something like this:

for(i=0; i<chromosome.length(); ++i) {
    // random generates double in the range [0, 1)
    if(random() < 1.0/chromosome.length()) {
       chromosome[i] = pick_random_letter();
    }
}

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow