Here's a proof-of-concept of a Constraint Programming approach. It's done in MiniZinc, (as usual when I prototyping CSP problems). I haven't implemented it in any Java system but hope it's of some use anyway.
The MiniZinc model is here: http://www.hakank.org/minizinc/bracket_partitioning.mzn
Here's the the principal approach:
The array ("x") of size 1..N is for assigning the person (x[i]) to which bracket. Symmetry breaking on "x":
- bracket 1 must be used before bracket 2 (the constraint value_precede_chain)
Another array ("set_size") of 1..N div 2 contains the number of person in each bracket.
Symmetry breaking on "set_size":
- The values in "set_size" must be in decreasing order.
Then there are three help arrays ("mins", "maxs","diffs") of size 1..N div 2 (i.e. the same same as "set_size") which includes the minimum, maximum values of each bracket, and also the difference (diff[s]) between maxs[s]-mins[s]. This difference must be within 15% (calculated as "10000*diffs[s] div maxs[s] <= 1500").
This 15% requirement is what makes the model a bit messy, but interesting.
The preference of 4 and 8 size brackets has been implemented by maximizing the number of brackets of size 4 and 8 (both have weight 1, the other bracket sizes have weight 0); this is the "z" variable. An alternative is to weight bracket size of 8 by 2 and size 4 of 1 (and all other weight 0) which thus prefer 8 size brackets over 4 size brackets.
Notes: - There are also some other constraints - implicit constraints and symmetry breaking - which tend to speed up the model, such as:
sum(set_size) = n % implicit constraint
x[1] = 1 % assign the first person to bracket 1
The code also includes some stuff for randomized testing such as rand_int_array (MiniZinc don't have that as a built-in). That can be ignored.
I don't know how large N can be in real life. Perhaps it's very large, then one have to add some more symmetry breaking etc or using another approach.
Here's an output from running the example given:
w: [100, 103, 105, 106, 110, 114, 120, 125]
z: 2
x: [1, 1, 1, 1, 2, 2, 2, 2]
set_size: [4, 4, 0, 0]
diffs: [6, 15, 0, 0]
mins: [100, 110, 0, 0]
maxs: [106, 125, 0, 0]
bracket 1: [100, 103, 105, 106]
bracket 2: [110, 114, 120, 125]
Here we see that two brackets of size 4 have the optimal value (z=2 since there are 2 size 4) as expected.
For another example with N=28, the model give this results ("w" is the array of 'random' weights).
w: [111, 109, 112, 146, 115, 103, 130, 145, 128, 127, 144, 114, 133, 126, 134, 133, 114, 134, 143, 116, 106, 104, 147, 110, 114, 102, 118, 130]
z: 7
x: [1, 1, 1, 2, 1, 3, 2, 2, 2, 4, 4, 3, 4, 4, 5, 6, 3, 5, 5, 3, 7, 7, 5, 7, 6, 7, 6, 6]
set_size: [4, 4, 4, 4, 4, 4, 4, 0, 0, 0, 0, 0, 0, 0]
diffs: [6, 18, 13, 18, 13, 19, 8, 0, 0, 0, 0, 0, 0, 0]
mins: [109, 128, 103, 126, 134, 114, 102, 0, 0, 0, 0, 0, 0, 0]
maxs: [115, 146, 116, 144, 147, 133, 110, 0, 0, 0, 0, 0, 0, 0]
bracket 1: [111, 109, 112, 115]
bracket 2: [146, 130, 145, 128]
bracket 3: [103, 114, 114, 116]
bracket 4: [127, 144, 133, 126]
bracket 5: [134, 134, 143, 147]
bracket 6: [133, 114, 118, 130]
bracket 7: [106, 104, 110, 102]
This is solved in about 0.7s by Gecode.