سؤال

Suppose I have a pattern P and some text T, and I want to find the largest prefix of P that matches a substring of T. Is it possible to modify the KMP algorithm to do such an operation? (If I remember correctly, the KMP algorithm does partial matches, but I am interested in the longest possible match).

هل كانت مفيدة؟

المحلول

As KMP is scanning the text, the state of the KMP shows the length of the longest prefix of the pattern that matches the text up to the current point, so you could record the maximum length seen and the point in the pattern at which it was seen, and it does look like you could use this to find a longest matching prefix of P.

Another way of doing this would be to put all prefixes of P into Aho-Corasick. The run-time behaviour would be very similar, although it would consume a little more store. It would allow you to use an existing library - if you had one for Aho-Corasick, instead of modifying a KMP implementation.

نصائح أخرى

Actually it is a typical scenario of the so called "extended-KMP".

See the sample code here and here.

مرخصة بموجب: CC-BY-SA مع الإسناد
لا تنتمي إلى StackOverflow
scroll top