Question

In .NET, if I want to match a string against a pattern whose capturing groups can occur any number of times, I could write something like the following:

String input = "a, bc, def, hijk";
String pattern = "(?<x>[^,]*)(,\\s*(?<y>[^,]*))*";

Match m = Regex.Match(input, pattern);
Console.WriteLine(m.Groups["x"].Value);

//the group "y" occurs 0 or more times per match
foreach (Capture c in m.Groups["y"].Captures)
{
    Console.WriteLine(c.Value);
}

This code would print:

a
bc
def
hijk

That seems straightforward, but unfortunately the following Java code doesn't do what the .NET code does. (Which is expected, since java.util.regex doesn't seem to distinguish between groups and captures.)

String input = "a, bc, def, hijk";
Pattern pattern = Pattern.compile("(?<x>[^,]*)(,\\s*(?<y>[^,]*))*");

Matcher m = pattern.matcher(input);

while(m.find())
{
     System.out.println(m.group("x"));
     System.out.println(m.group("y"));
}

Prints:

a
hijk

null

Can someone please explain how to accomplish the same using Java, without having to re-write the regular expression or use external libraries?


Solution

What you want is not possible in Java. When the same group matches several times within one match, only the last occurrence of that group is saved. For more information, read the "Groups and capturing" section of the Pattern documentation. In Java, the Matcher steps through the String one match at a time: each call to find() moves on to the next match, and group() only reports on the current one.

Example with repetition:

String input = "a1b2c3";
Pattern pattern = Pattern.compile("(?<x>.\\d)*");
Matcher matcher = pattern.matcher(input);
while(matcher.find())
{
     System.out.println(matcher.group("x"));
}

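Prints the following. The trailing null appears because the * also lets the pattern match the empty string at the end of the input, and group x takes no part in that empty match: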

c3
null

Without repetition:

String input = "a1b2c3";
Pattern pattern = Pattern.compile("(?<x>.\\d)");
Matcher matcher = pattern.matcher(input);
while(matcher.find())
{
     System.out.println(matcher.group("x"));
}

Prints:

a1
b2
c3
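
Building on the find() loop without repetition, here is a minimal sketch of how the question's input could be collected item by item. Note that it does rewrite the regular expression, so it does not satisfy the asker's constraint. The class name RepeatedCaptures, the method items and the pattern itself are assumptions made for illustration and are not taken from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedCaptures {
    // Hypothetical pattern (not the one from the question): each match is one
    // list item, preceded either by the start of the input or by a comma
    // followed by optional whitespace.
    private static final Pattern ITEM = Pattern.compile("(?:^|,\\s*)(?<item>[^,]*)");

    // Collects the "item" group of every successive match into a list,
    // which plays the role of the .NET Captures collection.
    static List<String> items(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = ITEM.matcher(input);
        while (m.find()) {
            result.add(m.group("item"));
        }
        return result;
    }

    public static void main(String[] args) {
        for (String item : items("a, bc, def, hijk")) {
            System.out.println(item);
        }
    }
}
```

Running main on the question's input should print a, bc, def and hijk on separate lines. The repetition is moved out of the pattern and into the while loop, so each occurrence is seen as its own match instead of being overwritten inside a single one.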

OTHER TIPS

You can use the Pattern and Matcher classes in Java, although they work slightly differently from .NET. For example, the following code:

Pattern p = Pattern.compile("(el).*(wo)");
Matcher m = p.matcher("hello world");
while(m.find()) {
  for(int i=1; i<=m.groupCount(); ++i) System.out.println(m.group(i));
}

Will print two strings:

el
wo
Licensed under: CC-BY-SA with attribution