Question

Why is non-greedy matching not working for me? Take the following example:

public String nonGreedy(){
   String str2 = "abc|s:0:\"gef\";s:2:\"ced\"";
   return str2.split(":.*?ced")[0];
}

In my eyes the result should be abc|s:0:"gef";s:2, but it is abc|s.

Solution

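The .*? in your regex matches any character except a line terminator, zero or more times, as few times as possible. Laziness only controls where the match ends, not where it starts: the engine still reports the leftmost match, so it begins at the first colon and extends until the first ced it finds. The match is therefore :0:"gef";s:2:"ced, and everything before it is abc|s.

To make the match start at the last colon before ced, exclude colons from the middle part: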

:[^:]*?ced
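
To see the difference between the two patterns, here is a minimal sketch that prints where each one starts matching and what it consumes. The class name LazyMatchDemo and the helper firstMatch are illustrative names, not part of the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LazyMatchDemo {
    // Hypothetical helper: returns "start -> matched text" for the first match.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.start() + " -> " + m.group() : "no match";
    }

    public static void main(String[] args) {
        String input = "abc|s:0:\"gef\";s:2:\"ced\"";
        // Lazy dot: the match still starts at the first colon (index 5).
        System.out.println(firstMatch(":.*?ced", input));    // 5 -> :0:"gef";s:2:"ced
        // No colons allowed in between: the match starts at the last colon (index 17).
        System.out.println(firstMatch(":[^:]*?ced", input)); // 17 -> :"ced
    }
}
```

Since split returns everything before the first match as element 0, the start index is what decides the result.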

On another note, you should use a constant Pattern to avoid recompiling the expression every time, something like:

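import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitExample {
    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    private static final Pattern REGEX_PATTERN =
            Pattern.compile(":[^:]*?ced");

    public static void main(String[] args) {
        String input = "abc|s:0:\"gef\";s:2:\"ced\"";
        // The match starts at the last colon before "ced".
        System.out.println(Arrays.toString(REGEX_PATTERN.split(input)));
        // prints [abc|s:0:"gef";s:2, "]
    }
}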

OTHER TIPS

It is behaving as expected. A non-greedy quantifier matches as little as it has to, but the match still starts at the leftmost possible position. With your input, that means the match runs from the first colon to the next ced.

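You could try limiting the number of characters consumed between the colon and ced. For example, to limit it to "up to 3 characters" (note that the limit has to be small enough to rule out earlier colons; with your input a limit of 3 still lets the match start at the colon before 2, so a limit of 1 or 2 is needed to get abc|s:0:"gef";s:2):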

:.{0,3}ced
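
As a rough sketch of how the limit changes the split point, the following compares two limits on the input from the question. The class name BoundedGapDemo and the helper head are made up for illustration.

```java
import java.util.regex.Pattern;

public class BoundedGapDemo {
    // Hypothetical helper: splits on ":" + at most maxGap characters + "ced"
    // and returns the part before the first match.
    static String head(String input, int maxGap) {
        return Pattern.compile(":.{0," + maxGap + "}ced").split(input)[0];
    }

    public static void main(String[] args) {
        String input = "abc|s:0:\"gef\";s:2:\"ced\"";
        // With a limit of 3, the colon before "2" is still close enough to "ced".
        System.out.println(head(input, 3)); // abc|s:0:"gef";s
        // With a limit of 1, only the last colon qualifies.
        System.out.println(head(input, 1)); // abc|s:0:"gef";s:2
    }
}
```

This approach only works when the maximum distance between the wanted colon and ced is known in advance.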

To make it split as close to ced as possible, use a negative look-ahead, with this regex:

:(?!.*:.*ced).*ced

The look-ahead rejects any colon that is followed by another colon and then ced, so the match can only start at the colon closest to ced.
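
A minimal sketch of the look-ahead variant applied to the input from the question follows. The class name LookaheadSplitDemo, the constant name, and the helper parts are illustrative only.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class LookaheadSplitDemo {
    // Matches a colon only if no later colon is followed by "ced".
    private static final Pattern LAST_COLON_BEFORE_CED =
            Pattern.compile(":(?!.*:.*ced).*ced");

    // Hypothetical helper wrapping the split call.
    static String[] parts(String input) {
        return LAST_COLON_BEFORE_CED.split(input);
    }

    public static void main(String[] args) {
        String input = "abc|s:0:\"gef\";s:2:\"ced\"";
        System.out.println(Arrays.toString(parts(input)));
        // prints [abc|s:0:"gef";s:2, "]
    }
}
```

Unlike the bounded quantifier, this version does not depend on how many characters sit between the colon and ced.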

Licensed under: CC-BY-SA with attribution