Shortest uncommon substring: shortest substring of one string that is not a substring of another string

StackOverflow https://stackoverflow.com/questions/12607512

  •  04-07-2021

Question

We need to find the shortest uncommon substring of two strings: given two strings a and b, find the length of the shortest substring of a that is not a substring of b.

How can this problem be solved using a suffix array?

The solution should have a complexity of no more than O(n log n).

Solution

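This can be solved in O(N) time with a generalized suffix tree.

After constructing the generalized suffix tree in O(N) time, perform a breadth-first search and find the first node that does not belong to both strings, i.e. a node reached only by suffixes of a. The path from the root to this node gives the shortest uncommon substring. Depth here is measured in characters, so with compressed edges the substring is the path to that node's parent plus the first character of the edge leading into the node.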
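The breadth-first idea can be sketched as follows. This is a minimal illustration, not the linear-time construction: it builds an uncompressed suffix trie of both strings in quadratic time and space instead of a real generalized suffix tree, so that each node sits at a depth equal to the substring length. The function name `shortest_uncommon_substring_trie` and the bitmask labels are assumptions made for this example.

```python
from collections import deque

def shortest_uncommon_substring_trie(a, b):
    """Return a shortest substring of a that is not a substring of b, or None.

    Illustration only: uses a naive suffix trie (O(N^2)), not an O(N) suffix tree.
    """
    A, B = 1, 2  # bit flags: which input string passes through a node

    def new_node():
        return {"children": {}, "mask": 0}

    root = new_node()
    # Insert every suffix of both strings, marking each visited node with its owner.
    for text, bit in ((a, A), (b, B)):
        for start in range(len(text)):
            node = root
            for ch in text[start:]:
                node = node["children"].setdefault(ch, new_node())
                node["mask"] |= bit

    # Breadth-first search: the first node seen that belongs only to a
    # spells a shortest substring of a that does not occur in b.
    queue = deque([(root, "")])
    while queue:
        node, path = queue.popleft()
        for ch, child in sorted(node["children"].items()):
            if child["mask"] == A:
                return path + ch
            if child["mask"] & A:  # shared node: keep exploring below it
                queue.append((child, path + ch))
    return None  # every substring of a also occurs in b
```

For example, `shortest_uncommon_substring_trie("aab", "ab")` returns `"aa"`, because the single characters `a` and `b` both occur in `"ab"`. When a is itself a substring of b, no uncommon substring exists and the sketch returns `None`.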


The same can be done with a generalized suffix array for the two input strings, also in O(N) time.

Construct the generalized suffix array together with the LCP array (or build the LCP array afterwards from the suffix array). Prepend a single zero to the LCP array and append another zero. Then find a pair of minimal LCP entries such that only suffixes of one string lie between them. In practice this is a linear scan of the LCP array that tracks the two smallest values seen, resetting both to infinity whenever you see a suffix of the other string or a suffix belonging to both strings. Among all such pairs, take the one whose larger element is smallest; that larger element, plus one for the first distinguishing character, gives the length of the shortest uncommon substring. This works because such a pair of minimal values delimits all descendants of the first node (the one closest to the root) in the corresponding suffix tree that does not belong to both strings.
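A hedged sketch of the LCP scan follows. It uses an equivalent per-suffix formulation rather than the exact pair-of-minima bookkeeping: for each suffix of a, a running minimum over the LCP array from the left and from the right gives the longest prefix shared with any suffix of b, and one more character makes it uncommon. The suffix array is built by naive sorting purely for brevity (a real solution would use an O(N) or O(N log N) construction), the LCP array uses Kasai's algorithm, and the function name `shortest_uncommon_length` and the separator encoding are assumptions of this example.

```python
def shortest_uncommon_length(a, b):
    """Length of a shortest substring of a that is not a substring of b, or None."""
    # Encode a + sep1 + b + sep2; both separators are unique and smaller than any symbol.
    s = [ord(c) + 2 for c in a] + [1] + [ord(c) + 2 for c in b] + [0]
    n = len(s)

    # Naive suffix array (illustration only, not linear time).
    sa = sorted(range(n), key=lambda i: s[i:])
    rank = [0] * n
    for pos, start in enumerate(sa):
        rank[start] = pos

    # Kasai: lcp[k] = common prefix length of suffixes sa[k-1] and sa[k].
    # lcp[0] and the extra entry lcp[n] are the two zero sentinels.
    lcp = [0] * (n + 1)
    h = 0
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]
            while i + h < n and j + h < n and s[i + h] == s[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h > 0:
                h -= 1
        else:
            h = 0

    def is_a(start):
        return start < len(a)

    def is_b(start):
        return len(a) < start < n - 1

    shared = [0] * n  # longest prefix of suffix sa[k] that also occurs in b
    cur = 0           # LCP with the nearest b-suffix above (0 if none yet)
    for k in range(n):
        cur = min(cur, lcp[k])
        if is_a(sa[k]):
            shared[k] = cur
        elif is_b(sa[k]):
            cur = n   # acts as infinity: restart the running minimum
    cur = 0           # same scan from the bottom for the nearest b-suffix below
    for k in range(n - 1, -1, -1):
        cur = min(cur, lcp[k + 1])
        if is_a(sa[k]):
            shared[k] = max(shared[k], cur)
        elif is_b(sa[k]):
            cur = n

    best = None
    for k in range(n):
        start = sa[k]
        # The extra character must still lie inside a, not on the separator.
        if is_a(start) and shared[k] < len(a) - start:
            if best is None or shared[k] + 1 < best:
                best = shared[k] + 1
    return best
```

The bounds check in the last loop matters: a suffix of a that occurs entirely inside b contributes no candidate, so `shortest_uncommon_length("ab", "abc")` returns `None` instead of counting the separator as a character.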

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow