Domanda

I have just begun working on a project that involves some scheduling optimization, and I am worried that I am in math waters over my head. I was wondering if you could think of any clever way to do the following.

Here's the basics:

  • You have x number of time slots
  • You have y number of faculty interviewers
  • You have z number of applicants interviewing
  • x and y do not have to be equal (there can be different numbers of interviewers and interviewees)
  • It is possible for a time slot to have no one being interviewed.
  • These make up a table as a schedule where the Row headings are the interviewers (represented by numbers), the Column headings are the time slots (represented by numbers), and the cells themselves are the applicants who are interviewing.

Constraints:

  • A z digit can only appear once in each row/column like Sudoku (because an applicant can't interview with the same interviewer twice and he/she can't interview twice at once).
  • There has to be exactly 3 of each z number in the entire table (because all applicants have to interview exactly 3 times).

Eventually, some formula will be used to calculate a weighted score for each valid configuration to determine a "best" state of the board. For explanation's sake, we'll just say it's something like 2a + 5b where a and b are different kinds of qualities interviewers have in common with applicants, but this isn't important for now.

I was trying to come up with some way of generating all possible valid configurations and eventually calculating the score of each valid configuration, finally picking the one with the highest score. I ran into some issues in finding a clever way to do so.

Initially, I wasn't feeling clever and thought about brute forcing it, but that is basically impossible because of the great variety of possibilities.

Can you think of any more clever way for me to prune back some of the erroneous configurations or a clever way to just generate valid boards for weighted score checking?

È stato utile?

Soluzione

Here is a quite simple Constraint Programming model in MiniZinc which - given the proper solver - generates all the possible valid solutions for (my interpretation of) the problem. However, as @TimChippingtonDerrick points out, for more complex problem instances there are too many solutions.

That said, if the objective (obj = "2a + 5b") somehow can be stated in the model (e.g. using some extra data of the interviewer/applicants etc), it's just a matter of changing it to an optimization problem, i.e. minimizing or maximizing that objective.

So, here is the MiniZinc model for a simple problem instance (x=10, y=5, z=10, and m=3). Note: in the interview table, the value 0 (zero) represents that there is no interview for the interviewer at that time.

include "globals.mzn"; 

int: x = 10; % number of time slots
int: y =  5; % number of faculty interviewers
int: z = 10; % number of applicants interviewing

int: m = 3; % number of times each applicants will be interviewed

% decision variables
% The interview table:
%   1..y: rows (interviewers)
%   1..x: columns (time slots)
array[1..y, 1..x] of var 0..z: s;

% solve satisfy;
solve :: int_search([s[i,j] | i in 1..y, j in 1..x], first_fail, indomain_random, complete) satisfy;

constraint

    % A z digit can only appear once in each row/column like Sudoku (because an applicant 
    % can't interview with the same interviewer twice and he/she can't interview twice at once).
    forall(a in 1..z) (
       % rows
       forall(i in 1..y) (
          %sum([bool2int(s[i,j] = a) | j in 1..x]) <= 1
          alldifferent_except_0([bool2int(s[i,j] = a) | j in 1..x])
       )
       /\
       % columns
       forall(j in 1..x) (
          % sum([bool2int(s[i,j] = a) | i in 1..y]) <= 1
          alldifferent_except_0([bool2int(s[i,j] = a) | i in 1..y])
       )
    )

    % There has to be exactly 3 of each z number in the entire table 
    % (because all applicants have to interview exactly 3 times).
    /\
    forall(a in 1..z) (
       count([s[i,j] | i in 1..y, j in 1..x], a, m)
    )
;

% constraint
%   % symmetry breaking: assign applicant a to interviewer a in time a (if possible)
%   forall(a in 1..z where a <= y) (
%      s[a,a] = a
%   )
% ;


output [
   "x (number of time slots (rows): " ++ show(x) ++ "\n" ++
   "y (number of interviews (columns): " ++ show(y) ++ "\n" ++
   "z (number of applicants): " ++ show(z) ++ "\n"
]
++
[
  if j = 1 then "\n" else " " endif ++
    show_int(2,s[i,j])
  | i in 1..y, j in 1..x
];

For this problem (x=10, y=5, z=10) there are a huge number of solutions. Here are two of them:

x (number of time slots (rows): 10
y (number of interviews (columns): 5
z (number of applicants): 10

5  2 10  8  0  6  7  4  3  9
3  1  5  7  0  2  9  6  8  4
8  0  0  5  6  1 10  3  0  7
2 10  1  0  0  9  0  0  0  0
0  0  0  0  0  4  0  0  0  0
----------
x (number of time slots (rows): 10
y (number of interviews (columns): 5
z (number of applicants): 10

 5  2 10  8  0  6  7  4  3  9
 3  1  5  7  0  2  9  6  8  4
 8  0  0  5  6  1 10  3  0  7
 2 10  0  1  0  9  0  0  0  0
 0  0  0  0  0  4  0  0  0  0

(This MiniZinc model is also here: http://www.hakank.org/minizinc/interview_scheduling.mzn )

A numerical comment on a simpler problem (x=6,y=3,z=3, m=3) to show the multitude of the problem: There are 317760 different solutions when not using the symmetry breaking (which is commented out in the code). With the symmetry breaking it's far less solutions: 1300. Here's one of these solutions:

1  0  0  3  2  0
3  2  1  0  0  0
0  1  3  2  0  0

Altri suggerimenti

Its sort of unstated in the replies above, probably because it is obvious to some of us, but just trying to enumerate all the legal combinations will only work in very small test cases of this sort of problem. You will quickly hit the wall in computation time if you attempt a naive implementation.

As suggested by others, you will need to look into using some optimisation tools if you want to do this for real-scale problems. Either one of the free solvers like COIN-OR, glpk or lpsolve; or one of the commercial tools - they can get expensive but if you have a big enough problem, then they are easily worth their cost.

Autorizzato sotto: CC-BY-SA insieme a attribuzione
Non affiliato a StackOverflow
scroll top