Question

I am attempting to capture a series of numbers from a specific line in a paragraph of text using regex. In the simplified example below, I'm just trying to capture the 4-digit numbers in the "Active Phone Lines" section. I'm assuming that there is an unknown number of active phone lines, and that the numbers cannot repeat themselves:

User Names: bob, jill, toni, tom
Active Phone Lines: 1010, 2020, 3030, 4040, 5050, 6060, 7070
Inactive Phone Lines: 1111, 2222, 3333, 4444, 5555

I know that I can split the string by carriage returns/newlines and just use a regular expression of ([0-9]{4}), but I got curious and want to see if I can just use one regular expression.
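
For reference, the line-splitting approach mentioned above could look roughly like this in Java. This is only a sketch: the class name LineSplitExample and the helper numbersOnLine are made up for illustration, and it assumes Java 8+ so that \R can be used to split on any kind of line break.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineSplitExample {
    private static final Pattern FOUR_DIGITS = Pattern.compile("[0-9]{4}");

    // Returns the 4-digit numbers found on the first line that starts with the given label.
    static List<String> numbersOnLine(String text, String label) {
        List<String> numbers = new ArrayList<>();
        // \R matches \r\n, \n, or \r, so the separator style does not matter.
        for (String line : text.split("\\R")) {
            if (line.startsWith(label)) {
                Matcher matcher = FOUR_DIGITS.matcher(line);
                while (matcher.find()) {
                    numbers.add(matcher.group());
                }
                break; // only the first matching line is of interest
            }
        }
        return numbers;
    }

    public static void main(String[] args) {
        String text = "User Names: bob, jill, toni, tom\n"
                + "Active Phone Lines: 1010, 2020, 3030, 4040, 5050, 6060, 7070\n"
                + "Inactive Phone Lines: 1111, 2222, 3333, 4444, 5555";
        System.out.println(numbersOnLine(text, "Active Phone Lines:"));
    }
}
```

Because startsWith is case-sensitive and anchored to the beginning of the line, the "Inactive Phone Lines" line is never selected.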

So far I was able to get all of what I want with the following regex:

(?<=Active Phone Lines: |, )([0-9]{4})(?=, |\rInactive Phone Lines:)

But this also captures 2222, 3333, and 4444 from the "Inactive Phone Lines" line. I know I can use back references to refer to previously captured groups, but as far as I can tell I can only reference them by capture order, not as "the previous capture". They also appear to work only within a single match, not across the successive matches of a search.
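
The over-matching can be reproduced with a small sketch. It assumes Java's java.util.regex engine (which accepts this kind of bounded-length lookbehind) and \r as the line separator, as in the regex above; the class name LookaroundOverMatch is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundOverMatch {
    // The lookaround-based attempt, with backslashes doubled for the Java string literal.
    static final Pattern ATTEMPT = Pattern.compile(
            "(?<=Active Phone Lines: |, )([0-9]{4})(?=, |\\rInactive Phone Lines:)");

    // Collects group 1 of every match found in the text.
    static List<String> capture(String text) {
        List<String> numbers = new ArrayList<>();
        Matcher matcher = ATTEMPT.matcher(text);
        while (matcher.find()) {
            numbers.add(matcher.group(1));
        }
        return numbers;
    }

    public static void main(String[] args) {
        // Lines are separated by \r to match the \r in the lookahead.
        String text = "User Names: bob, jill, toni, tom\r"
                + "Active Phone Lines: 1010, 2020, 3030, 4040, 5050, 6060, 7070\r"
                + "Inactive Phone Lines: 1111, 2222, 3333, 4444, 5555";
        // The result includes 2222, 3333, and 4444 because each of them
        // is both preceded and followed by ", ".
        System.out.println(capture(text));
    }
}
```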

Is there a way to back reference the previous captured group? Assuming $foo would do so, I could then use the following regex:

(?<=Active Phone Lines: |$foo, )([0-9]{4})(?=$foo, |\rInactive Phone Lines:)

Solution

You can make use of a \G anchor like this:

(?:Active Phone Lines:|\G)[\s,]*([0-9]{4})

In Java, where each backslash must be doubled inside the string literal:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("(?:Active Phone Lines:|\\G)[\\s,]*([0-9]{4})");
        String test = "User Names: bob, jill, toni, tom\n" +
                      "Active Phone Lines: 1010, 2020, 3030, 4040, 5050, 6060, 7070\n" +
                      "Inactive Phone Lines: 1111, 2222, 3333, 4444, 5555";
        Matcher matcher = pattern.matcher(test);
        while (matcher.find()) {
            System.out.println(matcher.group(1)); // prints 1010 through 7070, one per line
        }
    }
}

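The \G anchor matches at the position where the previous match ended (and at the start of the string when there is no previous match, but that's not an issue here). Each number after the first is therefore only matched if it directly follows the previous one, separated by nothing but commas and whitespace. After 7070 the next non-separator character is not a digit, so the chain breaks, and the case-sensitive "Active Phone Lines:" does not match inside "Inactive Phone Lines:", so the inactive numbers are skipped.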
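
If the input could begin with 4-digit numbers, the start-of-string behaviour of \G would matter. The sketch below compares the pattern as given with a guarded variant that uses \G(?!\A) to reject a match at the very start of the input. The guard is a suggested addition, not part of the pattern above, and the class name StartAnchorExample and the sample input are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StartAnchorExample {
    // Pattern as given: \G also matches at the very start of the input.
    static final Pattern UNGUARDED =
            Pattern.compile("(?:Active Phone Lines:|\\G)[\\s,]*([0-9]{4})");
    // Guarded variant: the negative lookahead (?!\A) rejects \G at the start of the input.
    static final Pattern GUARDED =
            Pattern.compile("(?:Active Phone Lines:|\\G(?!\\A))[\\s,]*([0-9]{4})");

    // Collects group 1 of every successive match.
    static List<String> capture(Pattern pattern, String text) {
        List<String> numbers = new ArrayList<>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            numbers.add(matcher.group(1));
        }
        return numbers;
    }

    public static void main(String[] args) {
        // Hypothetical input whose first line starts with 4-digit numbers.
        String text = "9999, 8888\n"
                + "Active Phone Lines: 1010, 2020\n"
                + "Inactive Phone Lines: 1111, 2222";
        System.out.println(capture(UNGUARDED, text)); // also picks up 9999 and 8888
        System.out.println(capture(GUARDED, text));   // only 1010 and 2020
    }
}
```

For the input in the question the two patterns behave identically, since the text starts with "User Names:" rather than digits.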


Licensed under: CC-BY-SA with attribution