Question

I've written a small, quick (to write, not to run!), algorithm for string matching in Python:

def bruteMatch(n,m):
    for i in range(0,len(n)-len(m)+1):
        if n[i:len(m)+i]==m:
            return(i)

Would the runtime for this algorithm be O(nm)? I'm comparing it to the worst-case runtime for the Horspool string-matching algorithm, which is also (nm). I guess my confusion stems from the fact that my algorithm appears initially as O(n) runtime as it simply iterates through the input n, using the indices as inputs to a slice notation equality statement? Thoughts?

Was it helpful?

Solution

Worst case is O(nm), yes. Because, you see, the == test tests each character in each string, which could take as long as the length of the shortest string in the equality test.

OTHER TIPS

This does indeed take O(n*m) time. You run the loop (n-m) times, and the string comparison itself takes (min(n,m)) time. This is fine while either n or m is very small, but consider a worst case where m=n/2:

The loop is executed (n-n/2) times, and the comparison takes (n/2) time, a total of (O(n^2)) time - not so great!

If performance is important, and the search string is large, consider using a hash-based algorithm such as Rabin–Karp.

Hope this helps!

def bruteMatch(n,m):
    for i in range(0,len(n)-len(m)+1):
        if n[i:len(m)+i]==m:
            return(i)
  1. for i in range(0,len(n)-len(m)+1) will loop len(n)-len(m)+1 times

  2. if n[i:len(m)+i]==m compare all characters in n from i to len(m)+i with m

Example 1

n = "Hello"
m = "Jello"
len(n)-len(m)+1 = 4-4+1 = 1

range(0,1)

i=0
if n[0:4+0] == m ~ n[0:4] == m[:] ~ "Hello" == "Jello"
FALSE

Example 2

n = "Hi"
m = "Jello"
len(n)-len(m)+1 = 2-4+1 = -1

range(0,-1) ~ *LOGICAL ERROR*

Example 3

n = "Hello"
m = "Hi"
len(n)-len(m)+1 = 4-2+1 = 3

range(0,3)

i=0
if n[0:2+0] == m ~ n[0:2] == m[:] ~ "He" == "Hi"
FALSE

i=1
if n[1:2+1] == m ~ n[0:3] == m[:] ~ "Hel" == "Hi"
FALSE

i=2
if n[2:2+2] == m ~ n[0:4] == m[:] ~ "Hell" == "Hi"
FALSE

Conclusion

Your code is going to compare n[0:z] for z in range (0, len(n)+1) for len(n)-len(m)+1 times:

If len(n) == len(m) then check n == m
Else if len(n) > len(n) then return check m is in [ n[0:len(m)], .., n[0:len(n)] ]
Else if len(n) < len(m) then Error: Invalid range
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top