Roulette Wheel Selection for Genetic Algorithm in Java

https://stackoverflow.com/questions/12774014

05-07-2021
|

Question

I am implementing a roulette wheel selection method for a genetic algorithm. My question, in essence, is pretty simple but I can't wrap my mind around it. In my fitness function, if an answer is extremely wrong it may return around -3000%. My problem is, when I attempt to assign probabilities for my results, they get skewed toward the wrong answers.

For example: If my percentages are in an array and are [92, 68, 5, -4, -3546] (from high to low) I need to give the numbers in the lower indices a greater chance of being selected than the numbers with higher indices.

Ignoring my fitness function, how do I create a probability based on this taking into account large negative numbers?

Some basic code I've tinkered with I found in another question:

public Individual rouletteWheelSelection() { 
    double randNum = m_rand.nextDouble() * this.totalFitness; 
    int idx; 
    for (idx=0; idx<POP_SIZE && randNum>0; ++idx) { 
        randNum -= m_population[idx].getFitnessValue(); 
    } 
    return m_population[idx-1]; 
}

(original link here: GA written in Java)

I had my GA working for a different selection method, but now I'm trying to modify this one to work instead. Any help would be greatly appreciated.

***Edit

The following code is my rouletteWheelSelection I've modified:

private Chromosome rouletteWheelSelection(){
    double randNum = Math.abs(rand_num.nextDouble() * totalFitness);
    int idx;
    for (idx=0;idx<NUM_CHROMOSOMES && randNum>0;++idx){
        randNum -= Math.abs(population[idx].getFitness());
    }
    return population[NUM_CHROMOSOMES-idx];
}

Here is my fitness function:

public double getFitness()
{
    String working = bitString;
    int x1 = Integer.parseInt(working.substring(0,6),2);
    int x2 = Integer.parseInt(working.substring(6),2);
    double result = ScratchGA.functionTest(x1,x2);
    double percentAccuracy = (1- Math.abs(((ScratchGA.getDesired() - result)/ScratchGA.getDesired())))*100;
    if (percentAccuracy <= 100)
    {
    return percentAccuracy;
    }
    else
    {
    return -percentAccuracy;
    }
}

The thought was that is a value was more than 100% different from what I needed, I made it negative to shove to the end of my sorted list.

Solution

The selection method shown in the question implicitly works only with positive or null fitness values.

With negative values, a first question arise with regards to computing the totalFitness: is this the algebraic sum of the fitness values or should it work with the absolute values thereof.

A more serious issue arises when the randNum is [supposed to be] decreased but somehow the negative fitness values result in re-growing RandNum.

A suggestion would be to alter the fitness function so that it only returns positive values.

A simple approach would be something like:

if (fitValue >= -5000)
  fitValue += 5000;
else
  fitvalue = 0;

Where -5000 is is arbitrarily chosen as the most negative value you would consider meaningful. In effect, this provide a form of truncation selection for the very least plausible solutions, something you are trying to avoid with roulette wheel, but apparently the current fitness function appears strongly skewed toward the negative side of the range (or possibly even unbound on the negative side).

Edit in view of added snippets in question and your remarks
Effectively, by working with Abs. values your version of rouletteWheelSelection() takes care of the "more serious" issue listed in my initial response.
However the getFitness() function, as suspected is very skewed in favor of negative values. Its operating range is [some_potentially_very_negative_value, +100].
See the code: the biggest value returned is +100, but there is a possibility for returning mighty big negative values when the value for ScratchGA.functionTest(x1,x2) is very different from the ScratchGA.getDesired() value.
There appears to be the need for some normalization of sorts, to prevent negative returns to be so much bigger than 100 (in absoute value).

This BTW, explains very well why, with such a fitness function, the rouletteWheelSelection() favors poor performing chromosomes.

Imagine for example that you have a population of 5 chromosomes with respective fitness value of 80, 70, 30, 20 and -250. The sum is 450, with 200 for all four chromosomes with a positive fitness and 250 for the one chromosome with a negative fitness. In this example instance, there is better than even chance to pick the worse of the chromosomes!
The idea behind roulette wheel selection is offer the possibility of selecting chromosomes with less than optimal fitness, but the probability of selecting any chromosome should be proportional to the amount the chromosome contributes to the overall sum of fitness values. The implementation you have effectively does this but the problem is that the value contributed to the sum for negative fitnesses appears disproportionate to what positive fitness values provide.

OTHER TIPS

You could use windowing in that you always add or subtract the populations worst fitness. So that the range for selection expands from 0 into positive values. Worst individual will never have a chance of being selected (similar to tournament selection). Because if you don't window your values then the individual with fitness 98 will have almost the same selection pressure to the one with 95 and 96. That's fine as long as your population includes lower quality solutions, but when all solutions are in the 90s selection pressure will decrease significantly. As your population converges to an optimal solution you will more and more act like random search. You can only make a directed exploration if you consider the finer and finer details (differences) in your population.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow