Question

I'm using rubular.com to build my regex, and their documentation describes the following:

(...)   Capture everything enclosed
(a|b)   a or b

How can I use an OR expression without capturing what's in it? For example, say I want to capture either "ac" or "bc". I can't use the regex

(a|b)(c)

right? Since then I capture either "a" or "b" in one group and "c" in another, not the same. I know I can filter through the captured results, but that seems like more work...

Am I missing something obvious? I'm using this in Java, if that is pertinent.


Solution

Depending on the regular expression implementation, you can use so-called non-capturing groups with the syntax (?:…). Java's java.util.regex supports them:

((?:a|b)c)

Here (?:a|b) is a group, but you cannot reference its match. You can only reference the match of the outer group ((?:a|b)c), which is either ac or bc.
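Since the question mentions Java, here is a minimal sketch of how this could look with java.util.regex. The class and method names (NonCapturingDemo, firstCapture, captureGroupCount) are made up for illustration; only Pattern and Matcher come from the standard library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of any library.
class NonCapturingDemo {
    // (?:a|b) groups the alternation without creating a capture group,
    // so the outer parentheses form the only capture group (group 1).
    private static final Pattern PATTERN = Pattern.compile("((?:a|b)c)");

    // Returns group 1 of the first match, or null if nothing matches.
    static String firstCapture(String input) {
        Matcher m = PATTERN.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    // Number of capture groups the pattern defines.
    static int captureGroupCount() {
        return PATTERN.matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(firstCapture("xxbcxx"));  // prints bc
        System.out.println(captureGroupCount());     // prints 1
    }
}
```

With the plain (a|b)(c) from the question, groupCount() would report 2 instead, and "a" or "b" would land in a different group than "c".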

OTHER TIPS

If your implementation has it, then you can use non-capturing parentheses:

(?:a|b)

Rubular doesn't require parentheses around an alternation, but keep in mind that the precedence of | is low: a|bc means "a" or "bc", not "ac" or "bc". For example, a|bc matches a and bc, but finds nothing in ccc.
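To see the precedence of | in practice, here is a small Java sketch comparing a|bc with (?:a|b)c. The class name AlternationPrecedenceDemo and the helper matchesWhole are hypothetical; Pattern.matches is the standard-library call that tests whether the whole input matches.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of any library.
class AlternationPrecedenceDemo {
    // True if the entire input matches the regex.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // True if the regex matches anywhere inside the input.
    static boolean foundIn(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Without grouping, the alternatives are "a" and "bc".
        System.out.println(matchesWhole("a|bc", "a"));       // true
        System.out.println(matchesWhole("a|bc", "bc"));      // true
        System.out.println(matchesWhole("a|bc", "ac"));      // false
        System.out.println(foundIn("a|bc", "ccc"));          // false

        // With a non-capturing group, the alternatives are "ac" and "bc".
        System.out.println(matchesWhole("(?:a|b)c", "ac"));  // true
        System.out.println(matchesWhole("(?:a|b)c", "a"));   // false
    }
}
```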

If your OR alternatives are all single characters, you can just use a character class (sometimes called a character set):

([ab]c)

It will only match ac or bc, and it's more readable.
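A rough Java sketch of the character-class variant, collecting every capture in an input string. CharacterClassDemo and allCaptures are illustrative names only; the regex itself is the one shown above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of any library.
class CharacterClassDemo {
    // [ab] matches a single 'a' or 'b'; the parentheses capture the pair.
    private static final Pattern PATTERN = Pattern.compile("([ab]c)");

    // Collects group 1 of every match found in the input, in order.
    static List<String> allCaptures(String input) {
        List<String> captures = new ArrayList<>();
        Matcher m = PATTERN.matcher(input);
        while (m.find()) {
            captures.add(m.group(1));
        }
        return captures;
    }

    public static void main(String[] args) {
        System.out.println(allCaptures("ac bc cc"));  // prints [ac, bc]
    }
}
```

This only works while every alternative is exactly one character; for alternatives like "foo" or "bar", the non-capturing group is still needed.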

Licensed under: CC-BY-SA with attribution