Question

I'm working on pipeline network optimization, and I represent each chromosome as a string of numbers, as follows.

Example:

chromosome [1] = 3 4 7 2 8 9 6 5

where each number refers to a well, and the distances between wells are defined. A well cannot appear more than once in a chromosome. For example:

chromosome [1]' = 3 4 7 2 7 9 6 5 (not acceptable, because well 7 appears twice)

What is the best mutation operator for a representation like this? Thanks in advance.


Solution

I can't say "best", but one model that I've used for graph-like problems is this: for each node (well number), compute the set of nodes / wells that are adjacent to it anywhere in the population. E.g.,

population = [[1,2,3,4], [1,2,3,5], [1,2,3,6], [1,2,6,5], [1,2,6,7]]
adjacencies = { 
  1 : [2]         ,   #In the entire population, 1 is always only near 2
  2 : [1, 3, 6]   ,   #2 is adjacent to 1, 3, and 6 in various individuals
  3 : [2, 4, 5, 6],   #...etc...
  4 : [3]         ,
  5 : [3, 6]      , 
  6 : [3, 2, 5, 7],
  7 : [6]         
}
choose_from_subset = [1,2,3,4,5,6,7] #At first, every well that appears in the population
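
The adjacency table above can be computed mechanically. Here is a minimal sketch, assuming each individual is a plain Python list of well numbers and that adjacency is undirected; the name `build_adjacencies` is my own and purely illustrative. It stores sets rather than lists, so the order of neighbours is not meaningful.

```python
from collections import defaultdict


def build_adjacencies(population):
    """Map each well to the set of wells that sit next to it in any individual."""
    adjacencies = defaultdict(set)
    for individual in population:
        # Walk over consecutive pairs, e.g. [1, 2, 3] gives (1, 2) and (2, 3).
        for left, right in zip(individual, individual[1:]):
            adjacencies[left].add(right)
            adjacencies[right].add(left)
    return dict(adjacencies)


population = [[1, 2, 3, 4], [1, 2, 3, 5], [1, 2, 3, 6], [1, 2, 6, 5], [1, 2, 6, 7]]
adjacencies = build_adjacencies(population)
```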

Then create a new individual / network by:

 choose_next_individual(adjacencies, choose_from_subset) : 
   Sort adjacencies by the size of their associated sets
   From the choices in choose_from_subset, choose the well with the highest number of adjacent possibilities (e.g., either 3 or 6, both of which have 4 possibilities)
   If there is a tie (as there is with 3 and 6), choose among them randomly (let's say "3")
   Place the chosen well as the next element of the individual / network ([3])
   fewerAdjacencies = Remove the chosen well from the set of adjacencies (see below)
   new_choose_from_subset = adjacencies to your just-chosen well (i.e., 3 : [2,4,5,6])
   Recurse -- choose_next_individual(fewerAdjacencies, new_choose_from_subset)

The idea is that nodes with many adjacencies are ripe for recombination, since the population hasn't converged on a single neighbor for them (the way it has for, e.g., 1->2). A lower but non-zero "adjacency count" implies convergence, and a zero adjacency count is (basically) a mutation.
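
In case it helps, here is a rough Python sketch of the procedure, written as a loop instead of a recursion. The function name `build_individual` and the `rng` parameter are my own, not part of any library. One assumption to flag: when the just-chosen well has no unused neighbours left, this version falls back to all remaining wells and still prefers the most-connected one, which reduces to a plain random pick when every remaining well has zero adjacencies.

```python
import random


def build_individual(adjacencies, rng=random):
    """Build one new individual from an adjacency table (illustrative sketch)."""
    # Work on a copy so the caller's table is left untouched.
    remaining = {well: set(near) for well, near in adjacencies.items()}
    candidates = set(remaining)  # at first, every well is a candidate
    individual = []
    while remaining:
        # Assumed fallback: if no candidate is still unused, consider all unused wells.
        pool = (candidates & set(remaining)) or set(remaining)
        most = max(len(remaining[well]) for well in pool)
        # Break ties randomly among the most-connected candidates.
        tied = sorted(well for well in pool if len(remaining[well]) == most)
        chosen = rng.choice(tied)
        individual.append(chosen)
        # The next candidates are the still-unused neighbours of the chosen well.
        candidates = remaining.pop(chosen)
        for near in remaining.values():
            near.discard(chosen)
    return individual


adjacencies = {
    1: [2], 2: [1, 3, 6], 3: [2, 4, 5, 6], 4: [3],
    5: [3, 6], 6: [3, 2, 5, 7], 7: [6],
}
new_graph = build_individual(adjacencies, random.Random(0))
```

Because every well is popped from `remaining` exactly once, the result is always a permutation with no duplicated wells, which is the constraint the question asks about.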

Just to show a sample run:

#Recurse: After removing "3" from the adjacencies
new_graph = [3]
new_choose_from_subset = [2,4,5,6] #from 3 : [2,4,5,6] 
adjacencies = { 
  1: [2]         ,
  2: [1, 6]      ,  
  4: []          ,
  5: [6]         , 
  6: [2, 5, 7]   ,
  7: [6]         
}


#Recurse: "6" has most adjacencies in new_choose_from_subset, so choose and remove
new_graph = [3, 6]
new_choose_from_subset = [2, 5,7]    
adjacencies = { 
  1: [2]         ,
  2: [1]         ,  
  4: []          ,
  5: []          , 
  7: []          
}


#Recurse: Amongst [2,5,7], 2 has the most adjacencies
new_graph = [3, 6, 2]
new_choose_from_subset = [1]
adjacencies = { 
  1: []          ,
  4: []          ,
  5: []          , 
  7: []          
}

#new_choose_from_subset contains only 1, so that's your next...
new_graph = [3,6,2,1]
new_choose_from_subset = []
adjacencies = {
  4: []          ,
  5: []          , 
  7: []          
}

#From here on out, you'd be choosing randomly between the rest, so you might end up with:
new_graph = [3, 6, 2, 1, 5, 7, 4] 

Sanity-check? 3->6 occurs 1x in original, 6->2 appears 2x, 2->1 appears 5x, 1->5 appears 0, 5->7 appears 0, 7->4 appears 0. So you've preserved the most-common adjacency (2->1) and two other "perhaps significant" adjacencies. Otherwise, you're trying out new adjacencies in the solution space.
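
Those counts are easy to reproduce. A small sketch, with `count_adjacency` being a hypothetical helper of my own that treats adjacency as undirected and counts each individual at most once:

```python
def count_adjacency(population, a, b):
    """Count the individuals in which wells a and b are neighbours, in either order."""
    count = 0
    for individual in population:
        pairs = zip(individual, individual[1:])
        if any({left, right} == {a, b} for left, right in pairs):
            count += 1
    return count


population = [[1, 2, 3, 4], [1, 2, 3, 5], [1, 2, 3, 6], [1, 2, 6, 5], [1, 2, 6, 7]]
new_graph = [3, 6, 2, 1, 5, 7, 4]

# One entry per consecutive pair in the new individual.
counts = {
    (a, b): count_adjacency(population, a, b)
    for a, b in zip(new_graph, new_graph[1:])
}
for pair, seen in counts.items():
    print(pair, seen)
```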

UPDATE: Originally I'd forgotten the critical point that, when recursing, you choose the most-connected node among those adjacent to the just-chosen node. That's essential to preserving high-fitness chains! I've updated the description.

Licensed under: CC-BY-SA with attribution