indexOf method of String class in jdk is implemented by using BF, why not use KMP or BM? [closed]

StackOverflow https://stackoverflow.com/questions/23146845

  •  05-07-2023
  •  | 
  •  

Question

indexOf method of String class in jdk is implemented by using BF, why not use KMP or BM? Below is how jdk implement this funcion. It use BF to solve. why not use a more efficientive method, such as KMP, BM ?

   static int indexOf(char[] source, int sourceOffset, int sourceCount,
                   char[] target, int targetOffset, int targetCount,
                   int fromIndex) {
        if (fromIndex >= sourceCount) {
            return (targetCount == 0 ? sourceCount : -1);
        }
        if (fromIndex < 0) {
            fromIndex = 0;
        }
        if (targetCount == 0) {
            return fromIndex;
        }

        char first  = target[targetOffset];
        int max = sourceOffset + (sourceCount - targetCount);

        for (int i = sourceOffset + fromIndex; i <= max; i++) {
            /* Look for first character. */
            if (source[i] != first) {
                 while (++i <= max && source[i] != first);
            }

            /* Found first character, now look at the rest of v2 */
            if (i <= max) {
                int j = i + 1;
                int end = j + targetCount - 1;
                for (int k = targetOffset + 1; j < end && source[j] ==
                     target[k]; j++, k++);

                if (j == end) {
                    /* Found whole string. */
                    return i - sourceOffset;
                }
            }
        }
        return -1;
   }
Was it helpful?

Solution

why not use a more efficientive method, such as KMP, BM ?

The more advanced string search algorithms have a non-trivial setup time. If you are doing a once-off string search involving a not-too-large target string, you will find that you spend more time on the setup than you save during the string search.

And even just testing the lengths of the target and search string is not going to give a good answer as to whether it is "worth it" to use an advanced algorithm. The actual speedup you get from (say) Boyer-Moore depends on the values of the strings; i.e. the character patterns.

The Java implementors have take the pragmatic approach. They cannot guarantee that an advanced algorithm will give better performance, either on average, or for specific inputs. Therefore they have left it to the programmer to deal with ... where necessary.


FWIW, I'm not aware of any other mainstream programming language that uses BM, etc in their runtime library's standard "string find" functionality.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top