How to find a superstar in linear time?

https://cs.stackexchange.com/questions/411

16-10-2019
|

Pergunta

Consider directed graphs. We call a node $v$ superstar if and only if no other node can be reached from it, but all other nodes have an edge to $v$. Formally:

$\qquad \displaystyle $v$ \text{ superstar } :\Longleftrightarrow \mathrm{outdeg}(v) = 0 \land \mathrm{indeg}(v) = n-1$

with $n$ the number of nodes in the graph. For example, in the below graph, the unfilled node is a superstar (and the other nodes are not).

A Superstar
^[source]

How can you identify all superstars in a directed graphs in $\mathcal{O}(n)$ time? A suitable graph representation can be chosen from the usual candidates; please refrain from using representations that move the problem's complexity to preprocessing.

No assumptions regarding density can be made. We don't assume the graph contains a superstar; if there is none, the algorithm should recognize it.

Notation: $\mathrm{outdeg}$ is a node's number of outgoing edges, $\mathrm{indeg}$ similar for incoming edges.

Solução

We can eliminate all but one of the vertices by checking for the existence of $n-1$ edges because we can eliminate one possibility for each edge we check. In particular, if there is an edge going from $x$ to $y$, we eliminate $x$ and move on to $y$ (as another vertex can be reached from it); if not, we eliminate $y$ (as it cannot be reached from $x$). Once we reach the last vertex, whichever vertex is not eliminated should be compared with each other vertex (ensure the superstar condition is upheld: there is an edge incoming but not outgoing) until it is eliminated or confirmed as the superstar. Some pseudocode:

vertex superstar(graph g)
    current vertex = first
    # Go through each vertex
    for each subsequent vertex in g ("next")
        # If there's an edge from this to the next, we eliminate this one [move to the new one].
        # If not, we just stay here.
        if edge exists from current to next
            candidate = next
        end if
    end for
    # Now we are on the final remaining candidate, check whether it satisfies the requirements.
    # just a rename for clarity
    candidate = current
    for each other vertex in g
        if edge from current to other exists
            return null 
        else if no edge from other to current
            return null
        end if
    end for
    return candidate
end superstar

Let's walk through an example to illustrate the method. Take this array, with the source vertex on the top and destination on the side. 1 indicates an edge:

$$ \begin{array}{r|c c c c|r} & 1 & 2 & 3 & 4 \\ \hline 1 & - & 1 & 0 & 1 \\ 2 & 1 & - & 0 & 1 \\ 3 & 1 & 1 & - & 1 \\ 4 & 1 & 1 & 0 & - \\ \end{array} $$

I'll grey out the vertices we have ruled out as potential superstars. I'll use green and red to indicate the edges we are looking at when they do and do not contain the edge we're looking for, and blue to indicate where we have already looked.

We start by looking at vertices 1 and 2.

$$ \begin{array}{r|c c c c} & 1 & 2 & 3 & 4 \\ \hline 1 & - & \color{green}{1} & 0 & 1 \\ 2 & 1 & - & 0 & 1 \\ 3 & 1 & 1 & - & 1 \\ 4 & 1 & 1 & 0 & - \\ \end{array} $$ The green number shows there is an edge from 2 to 1, so we eliminate 2 and look for an edge from 3 to 1:

$$ \begin{array}{r|c c c c} & 1 & \color{grey}{2} & 3 & 4 \\ \hline 1 & - & \color{blue}{1} & \color{red}{0} & 1 \\ \color{grey}{2} & 1 & - & 0 & 1 \\ 3 & 1 & 1 & - & 1 \\ 4 & 1 & 1 & 0 & - \\ \end{array} $$

We see there is no such edge, so we eliminate 1 and take 3 as our current vertex. Recall that we have eliminated 2 already, so see whether there is an edge from 4 to 3:

$$ \begin{array}{r|c c c c} & \color{grey}{1} & \color{grey}{2} & 3 & 4 \\ \hline \color{grey}{1} & - & \color{blue}{1} & \color{blue}{0} & 1 \\ \color{grey}{2} & 1 & - & 0 & 1 \\ 3 & 1 & 1 & - & \color{green}{1} \\ 4 & 1 & 1 & 0 & - \\ \end{array} $$

There is an edge from 4 to 3, so we eliminate 4. At this point we have eliminated all but one of the vertices (3), so check out its edges and see whether it qualifies:

$$ \begin{array}{r|c c c c} & \color{grey}{1} & \color{grey}{2} & 3 & \color{grey}{4} \\ \hline \color{grey}{1} & - & \color{blue}{1} & \color{red}{0} & 1 \\ \color{grey}{2} & 1 & - & 0 & 1 \\ 3 & \color{green}{1} & 1 & - & \color{blue}{1} \\ \color{grey}{4} & 1 & 1 & 0 & - \\ \end{array} $$

There is an edge from 1 to 3 but not the reverse, so 3 is still a candidate.

$$ \begin{array}{r|c c c c} & \color{grey}{1} & \color{grey}{2} & 3 & \color{grey}{4} \\ \hline \color{grey}{1} & - & \color{blue}{1} & \color{blue}{0} & 1 \\ \color{grey}{2} & 1 & - & \color{red}{0} & 1 \\ 3 & \color{blue}{1} & \color{green}{1} & - & \color{blue}{1} \\ \color{grey}{4} & 1 & 1 & 0 & - \\ \end{array} $$

There is also an edge from 2 to 3 but not the reverse, so 3 is still a candidate.

$$ \begin{array}{r|c c c c} & \color{grey}{1} & \color{grey}{2} & 3 & \color{grey}{4} \\ \hline \color{grey}{1} & - & \color{blue}{1} & \color{blue}{0} & 1 \\ \color{grey}{2} & 1 & - & \color{blue}{0} & 1 \\ 3 & \color{blue}{1} & \color{blue}{1} & - & \color{green}{1} \\ \color{grey}{4} & 1 & 1 & \color{red}{0} & - \\ \end{array} $$

There is an edge from 4 to 3 but not 3 to 4; that completes our check of 3's edges and we have found that it is, in fact, a superstar.

Since we eliminate one vertex as a possible superstar on each of the first $n-1$ edge checks, the best case is that the $n$th check eliminates the final vertex for a complexity of $n$. In the worst case (the last or second-to-last vertex is a superstar, or the final check disqualifies it), after the first $n-1$ comparisons we compare the candidate with $2\times(n-1)$ more vertices, for a worst case complexity of $3n-3$ ($O(n)$). So, this algorithm is $\Theta(n)$.

Outras dicas

Isn't this the Celebrity Problem?

There will only be one superstar(celebrity) if there is one.

We use the adjacency matrix representation, where $A[i,j] = 1$ if there is a directed edge from $i$ to $j$, otherwise it is $0$. (I am guessing that is allowed).

By looking at $A[i,j]$ (can be done in $\mathcal{O}(1)$ time) we can eliminate at least one of them as a candidate for being the celebrity: If $A[i,j] = 1$, then you can eliminate $i$. If $A[i,j] = 0$ we can eliminate $j$.

Maintain a list of current candidates, eliminating one by one. A linked list should suffice.

At the end, you can verify if your candidate is indeed a superstar.

This algorithm will be $\mathcal{O}(n)$.

This answer addresses the version of the question where any graph representation was possible, not the current version of the question.

Store your graph as a pair of adjacency list and reverse adjacency list, where each list contains in addition the length of the list, hence the number out- and in-edges, respectively.
Preprocessing: compute reverse adjacency list (that is, the list of in-edges). Cost $\mathcal{O}(|E|)$.
Traverse the lists and select any node where the number of out-edges is $0$ and the number of in-edges is $n-1$. Cost $\mathcal{O}(|N|)$.

Just for reference, this is pseudocode of a recursive version of what Kevin posted.

superstar(V, E) {
  if ( |V| == 1 ) {
    return V.pop
  }

  a = V.pop
  b = V.pop
  if ( (a,b) ∈ E ) {
    no_ss = a
    keep  = b
  }
  else {
    no_ss = b
    keep = a
  }

  s = superstar(V ++ keep)

  return ( s != null && (no_ss, s) ∈ E && !(s, no_ss) ∈ E ) ? s : null
}

hasSuperstar(V, E) = superstar(V, E) != null

Licenciado em: CC-BY-SA com atribuição

Não afiliado a cs.stackexchange