There is a fast algorithm that uses the prefix function. A prefix function of a string s is an array p, where p[i] is the longest length of the prefix of a substring s[0..i] (0-indexed) and its suffix. It can be calculated with O(n) complexity using KMP that uses 2 facts:
- p[i+1]<=p[i]+1.
- For each i, if s[p[i]]==s[i+1], then p[i+1] = p[i] + 1. Otherwise, we should try another string, for which s[0...j-1]==s[i-j+1...i]. Obviously, (we choose the longest string) we should just jump to the position i = p[i-1].
The algorithm (c++):
vector<int> prefix (string s)
{
int n=s.length();
vector<int> pi(n);
pi[0]=0;
for (int i=1; i<n; ++i)
{
int j = pi[i-1];
while (j>0 && s[i]!=s[j])
j=pi[j-1];
if (s[i]==s[j]) ++j;
pi[i]=j;
}
return pi;
}
Now we can construct the suffix table:
m = text.length();
vector<int> suffshift(m);
vector<int> pi = prefix(pattern);
vector<int> pi1 = prefix(inverse(pattern));
for (int j=0; j<m; ++j)
{
suffshift[j] = m - pi[m];
}
for (int i=1; i<m; ++i)
{
j = m - pi1[i];
suffshift[j]=min(suffshift[j], i-pi1[i]);
}
Suffshift[m] stands for the empty suffix, suppshift[0] - for the whole text. THe complexity is O(n).