KMP string matching is correct if the i'th entry of the failure table is the length of the longest suffix-prefix match of the prefix of the pattern of length i.
Notice that if A[0..k]
is a suffix-prefix match of A[0..i]
, then A[0..k]
is either the longest suffix-prefix match of A[0..i]
or it's a suffix-prefix match of said longest suffix-prefix match.
When you put these two things together, it turns out that we want failure[i]
to be the length of the longest suffix-prefix match of pattern[0..i]
. KMP preprocessing builds failure
inductively, using the following fact:
If the longest suffix-prefix match of A[0..i]
is nonempty, chopping off the last character will give some suffix-prefix match of A[0..i-1]
.
Consequently, the longest suffix-prefix match of A[0..i]
is either empty or it's formed by extending the longest suffix-prefix match of A[0..i-1]
by one character or by extending a suffix-prefix match of that longest suffix-prefix match by one character. The failure function for the preceding characters gives you a straightforward way to iterate over all suffix-prefix matches of pattern[0..i-1]
.