Question

I have tried going through the theory in the paper http://webglimpse.net/pubs/suffix.pdf

But I am kind of lost when they say

Let Ai be the first suffix in the first bucket (i.e., Pos[0] = i), and consider Ai-h (if i-h < 0, then we ignore Ai and take the suffix of Pos[1], and so on). Since Ai starts with the smallest h-symbol string, Ai-h should be the first in its 2h-bucket.

I am not able to understand this statement. Why Ai-h can be ignored if i-h < 0. How the position is getting determined in const time when i-h > 0 in the phase 1?

One sample impl is http://belbesy.wordpress.com/2012/10/10/spoj-649-distinct-substrings-suffix-arrays-nlgn/

Was it helpful?

Solution

I strongly recommend that, instead of trying to understand the C++ code, walk through this Python implementation of the Manbers-Myers suffix array construction algorithm , by hand, for a simple 5 character example.

Because the Python version is only about 15 lines of code, so it's pretty easy to follow.

Even if you don't understand Python, treat it as pseudocode and Google the syntax you don't understand.

Personally, I walked through one 5 character string by hand, and it was enough to help me to understand how the algorithm worked..

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top