how to reduce the time complexity?

https://cs.stackexchange.com/questions/128105

29-09-2020
|

Question

I have a graph G=(V+E) and list of list Node where each sublist is subset of V. I want to find out nodes of each sublist that are not reachable from other nodes of that sublist and add edge between those nodes. Suppose,Node=[[1,2,3,4,5,7,8],[9,10],....] and in sublist=[1,2,3,4,5,7,8], {1,7,3,8} reachable from each other and {2,4,5} reachable from each other so an edge needs to be added between 1/7/3/8 and 2/4/5. I have used the following :

for each sublist in Node
 SG=node_induced_subgraph(sublist) of G
 C=connected_components(SG) 
 for each component c_i in C
  add edge between c_i and c_{i+1}
 end
end

complexity of node_induced_subgraph() is O(V+E) complexity of connected_components() is O(m+n) # m is no. of nodes and n is no. of edges in Sub-graph

So how to reduce the overall complexity?

Solution

I'm gonna address your question in two parts. First how to ensure reachability between nodes in one "sublist" with adding a minimal amount of edges (if you can add as many edges as you wanted you could go through the list and connect one element to its two neighboring elements - costing $O(n)$ edge addition operations, which is theoretically the fastest method...). If we want to add as few edges as possible we only want to connect the "connected components". I suppose your graph $G$ is stored as an adjacency list.

//reused arrays
visited := bool array of size |V| initialized to false
active  := bool array of size |V| initialized to false
//simple dfs
def dfs(node):
  if visited[node] or not active[node]: return
  visited[node] = true
  for child in neighbors(node):
    dfs(child)
//algo
def connect_node_list(list):
  for node in list: active[node] = true
  last := null
  for node in list:
    if not visited[node]: 
      dfs(node)
      if last != null: add_edge(last,node)
      last = node
  for node in list: 
    active[node]  = false
    visited[node] = false

Now this algo has a runtime of $O(n+|E|)$ where $n$ is the length of list. Well to be honest it's the same as your algo I just want to highlight that the usage of bool arrays is very desirable in general; also I don't know how your "indcued subgraph" method works - but you should avoid building a new graph.

Now an idea that might be new: you can reduce the number of sublists by connecting those that have an overlapping element. (Of course this is only useful if they aren't necessarily distinct.) You can use a union-find-structure for that:

uf := union find structure with |V| elements
for list in sublists:
  first = list[0]
  for node in list:
    uf.unify(first, node)
index := array of length |V|
counter := 0
for element in uf:
  index[element]  = -1
  if uf.parent(element) = element and uf.size(element) > 1:
      index[element] = counter
      counter++
new_sublists = list of <counter> many lists
for element in uf:
  set = uf.find(element)
  if index[set] != -1:
    new_sublists[index[set]].add(element)

Now ´new_sublists´ only contains the necessary amount of sublists (also all sublists of size 1 are removed). The procedure takes $O(max(|V|,N\cdot\alpha (|V|)))$ where $N=\sum_i \text{|sublists[i]|}$. $\alpha(|V|)$ can basically be treated as a constant but you can read up on the union-find-structure article. If you are working with many overlapping node lists this algo should be executed first.

Licensed under: CC-BY-SA with attribution

Not affiliated with cs.stackexchange