Question

Suppose you have a set of jobs, some of which can be done in parallel. Each job has a time requirement (the time requirement for the i-th job is t_i). There are also some dependencies, the i-th of which says that you must finish job u_i before starting job v_i. You have to minimize the total time required.

This is easily done by converting these relations into a directed acyclic graph and then using it to determine which ones to do in parallel.

In case I'm not clear, here is an example. Suppose you have 5 jobs with time requirements 2, 9, 3, 12, 5, and you have to do 3 before 5, 4 before 5, 3 before 1, and 1 before 2. Then the best you can do is 17. This is your DAG:

+---> 1 (2) ---> 2 (9)
|
3 (3)
|
+----> 5 (5)
       ^
       |
4 (12)-+

You can do 3 and 4 in parallel, so that you spend MAX(3,12)=12 units of time before doing 5, which takes 5 units of time. So 5 is completed after 17 units of time. On the other hand 2 is completed after 14 units. So the answer is 17.
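The computation described above — the earliest finish time of each job is its own duration plus the latest finish time among its prerequisites — can be sketched as a longest-path pass over the DAG in topological order. This is a minimal illustration (function and variable names are mine), using 0-indexed jobs:

```python
from collections import deque

def min_total_time(times, deps):
    """Minimum total time = maximum weight of a path in the dependency DAG.

    times: list of job durations (times[i] for job i, 0-indexed).
    deps:  list of (u, v) pairs meaning job u must finish before job v starts.
    """
    n = len(times)
    adj = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v in deps:
        adj[u].append(v)
        indeg[v] += 1

    # finish[i] = earliest time at which job i can be completed
    finish = times[:]
    queue = deque(i for i in range(n) if indeg[i] == 0)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            finish[v] = max(finish[v], finish[u] + times[v])
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return max(finish)

# The example from the question (jobs 1..5 become indices 0..4):
times = [2, 9, 3, 12, 5]
deps = [(2, 4), (3, 4), (2, 0), (0, 1)]
print(min_total_time(times, deps))  # -> 17
```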

The question is: if exactly one time requirement is updated in each query (where every time you start from the original graph, not the graph resulting from previous modifications), how can you efficiently find the new minimum total time?

For those who want constraints: number of jobs <= 10^5, number of dependencies <= 10^6, number of queries <= 10^6.


Solution

The time requirement is the maximum weight of a path in the directed acyclic graph. Two linear-time traversals yield, for each node, the maximum weight of a path ending at that node and the maximum weight of a path starting at that node. The maximum weight of a path through a given node is then the sum of these two values minus the node's weight (which is counted in both traversals). If a node's weight increases, then paths avoiding that node are unaffected, so the new answer is the maximum of the previous answer and the new maximum through that node, which we can compute in constant time.
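The two traversals and the constant-time handling of an increase query might look like this (a sketch; the names `down`, `up`, and both function names are mine):

```python
from collections import deque

def precompute(times, deps):
    """down[i]: max weight of a path ending at i;
       up[i]:   max weight of a path starting at i."""
    n = len(times)
    adj = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v in deps:
        adj[u].append(v)
        indeg[v] += 1
    # Topological order via Kahn's algorithm.
    topo = []
    q = deque(i for i in range(n) if indeg[i] == 0)
    while q:
        u = q.popleft()
        topo.append(u)
        for v in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                q.append(v)
    down = times[:]
    up = times[:]
    for u in topo:                       # forward pass
        for v in adj[u]:
            down[v] = max(down[v], down[u] + times[v])
    for u in reversed(topo):             # backward pass
        for v in adj[u]:
            up[u] = max(up[u], times[u] + up[v])
    return down, up

def answer_after_increase(i, new_t, times, down, up, old_answer):
    # Max weight of a path through i under the increased weight; paths
    # avoiding i are unchanged, so the old answer still bounds them.
    through_i = (down[i] - times[i]) + new_t + (up[i] - times[i])
    return max(old_answer, through_i)
```

On the example, increasing job 2's time (index 1) from 9 to 20 makes the chain 3 -> 1 -> 2 cost 3 + 2 + 20 = 25, which this computes in O(1) per query after the O(n + m) precomputation.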

Decreases are trickier, since we need, for each node in the maximum weight path, the maximum weight of a path not including that node. The first way (maybe not the best) that I can think of to do this is the following. Add to the acyclic directed graph a source and a sink, both with weight zero and connections to (source) or from (sink) each of the other nodes. Number the nodes in topological order, where the source is 0 and the sink is n + 1, and initialize a segment tree mapping nodes to path weights, where the initial values are -infinity. This segment tree has the following logarithmic-time operations.

Weight(i) - returns the value for node i
Update(i, j, w) - updates the value for nodes i..j to
                  the maximum of the current value and w

For each arc from i to j, call Update(i + 1, j - 1, w), where w is the maximum weight of a path that includes the arc from i to j. At the end, each weight in the segment tree is the maximum weight of a path excluding the corresponding node.
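Putting the whole construction together — source/sink augmentation, topological numbering, and the range-maximize / point-query segment tree — might look like the following sketch (all names are mine; the tree is an iterative range-update, point-query variant where Update marks O(log n) canonical nodes and Weight takes the maximum along the root path):

```python
import math
from collections import deque

def build_exclusion_maxima(times, deps):
    """For each job i, the maximum weight of a path that avoids i."""
    n = len(times)
    # Augment with a source S and sink T of weight 0, connected to every job.
    S, T = n, n + 1
    w = times + [0, 0]
    adj = [[] for _ in range(n + 2)]
    for u, v in deps:
        adj[u].append(v)
    for i in range(n):
        adj[S].append(i)
        adj[i].append(T)

    # Topological order (Kahn); S comes first, T last.
    indeg = [0] * (n + 2)
    for u in range(n + 2):
        for v in adj[u]:
            indeg[v] += 1
    q, topo = deque([S]), []
    while q:
        u = q.popleft()
        topo.append(u)
        for v in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                q.append(v)
    pos = [0] * (n + 2)
    for idx, u in enumerate(topo):
        pos[u] = idx

    # down[u]: max path ending at u; up[u]: max path starting at u.
    down, up = w[:], w[:]
    for u in topo:
        for v in adj[u]:
            down[v] = max(down[v], down[u] + w[v])
    for u in reversed(topo):
        for v in adj[u]:
            up[u] = max(up[u], w[u] + up[v])

    # Segment tree over topological positions: Update is a range
    # "maximize", Weight is a point query over all covering ranges.
    size = n + 2
    tree = [-math.inf] * (2 * size)

    def update(l, r, val):               # positions l..r inclusive
        l += size
        r += size + 1
        while l < r:
            if l & 1:
                tree[l] = max(tree[l], val)
                l += 1
            if r & 1:
                r -= 1
                tree[r] = max(tree[r], val)
            l //= 2
            r //= 2

    def weight(i):
        i += size
        res = -math.inf
        while i >= 1:
            res = max(res, tree[i])
            i //= 2
        return res

    # An arc (u, v) "covers" every node strictly between u and v in
    # topological order; the best path using that arc weighs down[u] + up[v].
    for u in range(n + 2):
        for v in adj[u]:
            if pos[u] + 1 <= pos[v] - 1:
                update(pos[u] + 1, pos[v] - 1, down[u] + up[v])

    return [weight(pos[i]) for i in range(n)]
```

A decrease query on node i is then answered as the maximum of this exclusion value and the best path through i recomputed with the new weight, exactly as in the increase case.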

(Note on the running time: by treating nodes without dependencies separately, the running time of this approach can be made O(m log n + n + q), where n is the number of nodes, m is the number of dependencies, and q is the number of queries. What my segment tree is computing is essentially 3D maxima, a problem well studied by computational geometers. With presorted input (at least in two dimensions), the fastest known algorithm for n points is O(n log log n), via van Emde Boas trees. There also exist algorithms with output-sensitive time bounds that beat that worst-case bound in some cases.)

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow