문제

I have generated an string using the following alphabet. {A,C,G,T}. And my string contains more than 10000 characters. I'm searching the following patterns in it.

  • ATGGA
  • TGGAC
  • CCGT

I have asked to use a string matching algorithm which has O(m+n) running time.

m = pattern length
n = text length

Both KMP and Rabin-Karp algorithms have this running time. What is the most suitable algorithm (between Rabin-Carp and KMP) in this situation?

도움이 되었습니까?

해결책

When you want to search for multiple patterns, typically the correct choice is to use Aho-Corasick, which is somewhat a generalization of KMP. Now in your case you are only searching for 3 patterns so it may be the case that KMP is not that much slower(at most three times), but this is the general approach.

Rabin-Karp is easier to implement if we assume that a collision will never happen, but if the problem you have is a typical string searching KMP will be more stable no matter what input you have. However, Rabin-Karp has many other applications, where KMP is not an option.

라이센스 : CC-BY-SA ~와 함께 속성
제휴하지 않습니다 StackOverflow
scroll top