Question

I failed the whole evening to calculate a simple shift table for the search term "anabanana" for use in the Boyer and Moore pattern matching algorithm.

I found the following example without any explanations: enter image description here

Can anybody help me understand and explain the image's method for finding the shift table?

Was it helpful?

Solution

I think I understand what is done here, so i'll try and explain.

The line "Wort" is the pattern you are analysing, there is (in my opinion) no need to consider the row "Text" above. Instead assume an additional row containing the zero-based position of the char within your pattern from left to right. The length m of the pattern p[] depicted is 9. Each row below I name p_i[] where i is the index on the right

Further explanation is based on 2:

In the lower rows beneath the pattern mark all characters matching the character in the pattern above. (Done here by crossing out)

for i=1 to m do
    search in the rows below for a subpattern where p[m-i]<>p_j[m-1] (*)
    and p[m-i+1, ..., m-1]=p_j[m-i+1, ..., m-1]
    index j is your shift value for shift[i]
od  

(*) Note: When the shifted pattern p_j got shifted too far right, there will be empty chars to compare. In this case, you can always assume == or <> as needed. Always use the minimum of all possible j.

example of assorted steps

I hope this helps, albeit a bit late.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top