How to edge-color a directed acyclic graph so that every path visits none or all edges of each color?

https://cs.stackexchange.com/questions/130283

29-09-2020
|

题

Given a directed acyclic graph $G$ and a start vertex $s$ and an end vertex $e$, consider a coloring of the edges valid if, for every path from $s$ to $e$ and every color $c$, either $c$ is never encountered along that path, or every edge that is colored $c$ is visited by that path.

Given $G,s,e$, I would like to find a valid coloring that uses the minimal number of colors. Is there an efficient algorithm for this problem?

I show below an example graph and a sample solution. The circle on the left is the starting vertex, the filled circle on the right is the end vertex.

解决方案

You can color a pair of arcs $(a_1,a_2)$ by the same color, if and only if all the paths from the source to the sink, passing through the arc $a_1$, also pass through the arc $a_2$.

Let's consider the set $P$ of all paths from the source to the sink in the graph $G=(V,A)$. Let's denote the subset $P(a) \subset P$ of all paths, passing through the arc $a$. Then we can define an equivalence relation on the set $A$:

$$(a_1 \sim a_2) \equiv (P(a_1) = P(a_2))$$

A minimum number of colors, needed to color all the arcs in the graph $G$ according to your restriction, will be equal to the number of equivalence classes for the relation above.

The algorithm to split all the arcs into such equivalence classes is exact, but may be slow for large graphs. It consists of two steps:

Step 1. For each arc $a \in A$ compute the subset $P(a) \subset P$. This can be done by scanning all the paths in the set $P$, and updating all the subsets $P(a)$ along each such path.

Step 2. Let's assume that we store all the subsets $P(a)$ as binary numbers. Sort the set $A$ by these numbers - this will allow us to group together all the arcs with the same subset of paths. Scan this sorted set of arcs, assigning the same color to arcs in each group.

其他提示

It sounds to me like the greedy algorithm should work, I'm not able to come up with any counter-examples, however, I haven't had time to try to prove the claim either.

Terminology

Definition. Let $s$ be the start and $t$ be the end vertices (source and sink, respectively). Let $a$ and $b$ be vertices where there is a path from $a$ to $b$ (from now on $a < b$), i.e. $s \leq a < b \leq t$. We say that an $st$-path $P = \left(s=v_1, \dots, v_\ell=t\right)$ is $(a,b)$-non-laminar if $a \in P$ and $b \notin P$, or $a \notin P$ and $b \in P$. Intuitively, this means that $P$ "branches" out between $a$ and $b$, and branches in after $b$, or branches out before $a$ and branches in between $a$ and $b$.

Let us define $\text{lca}(v)$ to be the vertex that is the lowest common ancestor of the vertices $\{u \in V \mid (u,v) \in E\}$, i.e. the lowest common ancestor of the in-neighbors of $v$.

We say that a path is $v$-non-laminar if it is $(\text{lca}(v), v)$-non-laminar.

Greedy algorithm.

(1) If a vertex has in-degree = 1 and out-degree = 1, then you use the in-arc's color for the out-arc.

(2) Every time you fan out, that is, whenever you have a vertex with out-degree at least two, each out-arc needs a new color.

(3a) Every time you fan in, i.e., there is a vertex $v$ with in-degree at least two, and there is no $v$-non-laminar path, you take the color of an in-arc of $\text{lca}(v)$.

(3b) Every time you fan in and there is a $v$-non-laminar path, you need a new color.

That should cover all possible cases, and I think it shouldn't be too difficult to show that you can do it in $O(n^2)$ time. It might be possible to lower the time to $O(n + m)$, but I can't think of it now.

I present a refinement on HEKTO's algorithm that I think works and should be more efficient: it runs in $O^*(\min(n^3,m^2))$ time.

Theory

Let $P(a)$ denote the set of paths that start at $s$, go through the arc $a$, and end at $e$.

Lemma 1. $a_1,a_2$ can be given the same color iff $P(a_1)=P(a_2)$.

Let $G^*$ be the dual graph of $G$, i.e., each arc of $G$ is a vertex of $G^*$, and for each pair of arcs $u \to v$, $v \to w$ in $G$ we connect the corresponding vertices with a directed arc in $G^*$. The start vertex of $G^*$ is a new vertex $s_0$, and it has an arc in $G^*$ to each arc $s \to v$ in $G$; and similarly for its end vertex.

Lemma 2. An arc $a_2$ is in every path of $P(a_1)$ iff $a_2$ is a dominator or post-dominator of $a_1$ in $G^*$.

Say that $a_1 \prec a_2$ if $a_1$ is the immediate dominator of $a_2$ and $a_2$ is the immediate post-dominator of $a_1$ in $G^*$.

Lemma 3. $P(a)=P(a')$ iff there exists a sequence of arcs $a_1,\dots,a_n$ such that $a=a_1 \prec a_2 \prec \cdots \prec a_n=a'$.

Algorithm

This theory immediately leads to an efficient algorithm for your problem:

Compute the dominator tree $D$ and post-dominator tree $D'$ of $G^*$.
Initialize a Union-Find data structure with each arc of $G$ in its own set.
For arc $a_1$ of $G$, let $a_2$ be its immediate dominator in $D$; if $a_1$ is the immediate dominator of $a_2$ in $D'$, call Union($a_1,a_2$).
Assign a different color to each set of the Union-Find data structure.

Running time analysis

If $G$ has $n$ vertices and $m$ arcs, then $G^*$ has $m$ vertices and $\min(n^3,m^2)$ arcs. Computing the dominator tree can be done in nearly linear time (see e.g., https://en.wikipedia.org/wiki/Dominator_(graph_theory)#Algorithms or Dominator Tree for DAG). The Union-Find algorithm can be done in nearly linear time. Thus, the running time is essentially $O(\min(n^3,m^2))$, ignoring logarithmic factors.

I wouldn't be surprised if there is a more efficient way to compute the dominator tree of $G^*$ without constructing $G^*$ explicitly, which would lead to improvements in the running time of this algorithm.

Proofs

Proof of Lemma 1. If $P(a_1) \ne P(a_2)$, there is some path that goes through $a_1$ but not $a_2$ (or vice versa), and then by the requirements, $a_1,a_2$ cannot be given the same color.

For the converse, suppose we form equivalence classes on the arcs where $a_1,a_2$ are equivalent if $P(a_1)=P(a_2)$, give each equivalence class a unique color, and color each edge according to the color of the equivalence class it is contained in. Then this satisfies all of the constraints: for any color $c$ and any two arcs $a_1,a_2$ colored $c$, we have $P(a_1)=P(a_2)$, so any path $p \in P(a_1)$ also satisfies $p \in P(a_2)$ and thus visits $a_2$; and any path $p \notin P(a_1)$ also satisfies $p \notin P(a_2)$ and thus does not visit $a_2$.

I haven't written out the proofs of Lemmas 2-3, so I recommend you do that and check my reasoning before using this algorithm.

There is a simple randomized linear-time algorithm (one-sided error). It is based on HEKTO's idea, using the equivalent relation.

The algorithm chooses weight $w_a$ for each arc $a$. Then, the algorithm computes the weighted sum of paths $W(a) = \sum_{p \in P(a)} \prod_{a' \in p} w_{a'}$ for each each arc $a$. All $W$ values can be computed by using two dynamic programming (combining "forward" DP and "backward" DP) and in $\Theta(n + m)$ arithmetic operations. The algorithm then assign one color for each $W$ value, using a hash map.

Pseudo code:

forward = [ 1 if it is the source, 0 otherwise | vertices ]
for each arc a in topological order:
    forward[a.to] += forward[a.from] * w[a]

backward = [ 1 if it is the destination, 0 otherwise | vertices ]
for each arc a in reverse topological order:
    backward[a.from] += w[a] * backward[a.to]

for each arc a:
    W[a] = forward[a.from] * w[a] * backward[a.to]

It is easy to see $W(a) = W(b)$ if and only if $P(a) = P(b)$ if weights $w$ are treated as formal variables. According to Schwartz–Zippel lemma, if we choose weight randomly from a finite field $F$, then one particular equality fails in at most probability $m/|F|$. The overall success probability of the algorithm can be bounded by $1 - m^3 / 2|F|$ because we have at most $m \choose 2$ equations we want to distinguish, but it should be more like $\approx 1 - m^2/|F|$ for "typical input" (though I'm not very sure). We can choose a large prime of size $p \approx 2^{64}$ and do a modular arithmetic $F = GF(p)$ to implement the algorithm.

许可以下： CC-BY-SA 和归因

不隶属于 cs.stackexchange