Question

This seemed like a simple problem: I need to extract a capturing group and optionally limit the group with a delimiting string.

In the example below, I provide a delimiting string of 'cd' and expect it to return 'ab' in all three cases: 'ab', 'abcd', and 'abcdefg'.

Here is the code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String expected = "ab"; // Could be more or less than two characters
        String[] tests = {"ab", "abcd", "abcdefg"};
        // Note: the ? here quantifies only the 'd', so a literal 'c' is still required
        Pattern pattern = Pattern.compile("(.*)cd?.*");

        for (String test : tests) {
            Matcher match = pattern.matcher(test);
            if (match.matches()) { // matches() requires the whole input to match
                if (expected.equals(match.group(1)))
                    System.out.println("Capture Group for test: " + test + " - " + match.group(1));
                else System.err.println("Expected " + expected + " but captured " + match.group(1));
            } else System.err.println("No match for " + test);
        }
    }
}

The output is:


    No match for ab
    Capture Group for test: abcd - ab
    Capture Group for test: abcdefg - ab

I thought that a lookahead might work, but I don't think there is one that is optional (i.e. zero or more instances).


Solution

Try this:

Pattern pattern = Pattern.compile("(.*?)(?:cd.*|$)");

The .*? is non-greedy (lazy), so group 1 stops at the first position where the rest of the regex can match: either cd followed by anything, or the end of the string.
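
As a quick check, here is a minimal sketch that runs this pattern against the three inputs from the question. The class name DelimitedCapture and the helper method capture are made up for illustration and are not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DelimitedCapture {
    // Lazy group 1, followed by either "cd" plus the remainder or the end of input
    private static final Pattern PATTERN = Pattern.compile("(.*?)(?:cd.*|$)");

    // Hypothetical helper: returns group 1, or null if the whole input does not match
    public static String capture(String input) {
        Matcher m = PATTERN.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        for (String test : new String[] {"ab", "abcd", "abcdefg"}) {
            // Each line should report "ab" as the captured text
            System.out.println(test + " -> " + capture(test));
        }
    }
}
```

Because the group is lazy, it grows one character at a time and stops as soon as the alternation succeeds, so the delimiter is never swallowed by the capture.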

OTHER TIPS

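Part of the problem is that the ? applies only to the d, so the pattern still requires a literal c; that is why "ab" does not match. Grouping the delimiter as (cd)? fixes that, but on its own it is not enough: the greedy (.*) then consumes the whole string, so the capture group also needs to be non-greedy and tied to the delimiter or the end of the string, as in the solution above.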
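
To see how the quantifier binds, here is a small sketch comparing the original pattern with a variant where the whole delimiter is optional. The class name QuantifierScope and the helper capture are hypothetical; a non-capturing group (?:cd) is used so that group 1 stays the same in both patterns.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierScope {
    // Hypothetical helper: returns group 1 on a full match, otherwise null
    public static String capture(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // cd? means "c, then an optional d": "ab" contains no c, so there is no match
        System.out.println(capture("(.*)cd?.*", "ab"));        // null
        // (?:cd)? makes the whole delimiter optional, so "ab" now matches
        System.out.println(capture("(.*)(?:cd)?.*", "ab"));    // ab
        // ...but the greedy (.*) takes everything, including the delimiter
        System.out.println(capture("(.*)(?:cd)?.*", "abcd"));  // abcd
    }
}
```

The last line shows why making the delimiter optional is only half the fix: nothing forces the greedy group to give characters back once the rest of the pattern can match the empty string.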

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow