Saturday, April 14, 2012

Design of a Fitnessfunction

Good design of a fitnessfunction is essential for the success of a genetic algorithm. In sumosim fitness is evaluated by playing multiple matches against reference robots. The reference robot might be the same for all matches. A match might be won, undecided or lost. Undecided means that no robot won within a certain amount of time. After finishing a number of games the following parameters are collected.
  1. relWinCount: Relative number of winning matches. A number between 0 and 1 where 0 means, no matches won and 1 means all matches won.
  2. relUndecidedCount: Relative number of undecided games. Also a number between 0 and 1. 0 means no match was undecided and 1 means all matches where undecided.
  3. relLossCount: Relative number of lost games. Also a number between 0 and 1 with the same meaning as above.
  4. relDurationMeanOnWin: The mean relative duration of winning games. A value between 0 and 1. 0 means that the mean duration of all winning matches was 0 and 1 means that all won games had exactly the maximum allowed time. Generally a low value indicates a better robot than a high value because it could faster defeat the opponent. If no match was won the value is set to 0.
  5. relDurationMeanOnLoss: The mean relative duration of lost games. A value between 0 and 1. 0 means all matches where finished immediately and 1 means all matches lasted exactly the maximum allowed time. So a high value indicates a fairly good robot as it did not win but could resist longer than a robot with a low duration value. If no match is lost the value is set to 1.
The sum of relWinCount, relUndecidedCount and relLossCount must (of course) always be 1.
For the GEVA  genetc algorithm a decending fitnessfunction is needed. Meaning that high positive values indicate bad fitness and low values stand for good fittness. 0 would be the best possible fitness value.

Fitnessfunction for lost matches

To keep it simple at first a fitnessfunction for lost matches is designed. In scala it looks like the following. The values for relWinCount and relDurationMeanOnWin are 0 for the following considerations.

Here is the sourcecode in scala.

  def fitness(result: TestResult): Double = {  
   // Duration makes only sense if any match was lost or won. Otherwise take fixed value  
   val relDurationMeanOnLoss = if (almostZero(result.relLossCount)) 1.0 else result.relDurationMeanOnLoss  
   val relDurationMeanOnWin = if (almostZero(result.relWinCount)) 0.0 else result.relDurationMeanOnWin  
   // Increasing fitness function  
   val inc = 1.0 + (1.0 - result.relLossCount) * winLossFactor + relDurationMeanOnLoss +   
             result.relWinCount * winLossFactor + (1.0 - relDurationMeanOnWin)  
   // Invert to get a decreasing fitness  
   1.0 / inc  
  }  

This is actually an increasing fitnessfunction based on the parameters described before. In the last codeline the value is inverted to get a decreasing fitnessfunction.

It was now a challange to find a sensible value for the winLossFactor. Trying it out was my way.
I calculated fitnessvalues for different relLossCount, relDurationMeanOnLoss and winLossFactors.
The following diagrams show the results.


A factor of 20 favors the winning of games where a lower factor as 2 favors quick winning. As both is necessary an intermediate value of 5 will be chosen for the beginning.

No comments:

Post a Comment