Can I use an OR in regex without capturing what's enclosed?
06-10-2020
Question
I'm using rubular.com to build my regex, and their documentation describes the following:
(...) Capture everything enclosed
(a|b) a or b
How can I use an OR expression without capturing what's in it? For example, say I want to capture either "ac" or "bc". I can't use the regex
(a|b)(c)
right? That would capture either "a" or "b" in one group and "c" in another, rather than both in the same group. I know I can filter through the captured results, but that seems like more work...
Am I missing something obvious? I'm using this in Java, if that is pertinent.
Solution
Depending on the regular expression implementation, you can use so-called non-capturing groups, written with the syntax (?:…):
((?:a|b)c)
Here (?:a|b) is a group, but you cannot reference its match. You can only reference the match of ((?:a|b)c), which is either ac or bc.
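Since the question mentions Java: java.util.regex supports the (?:…) syntax. The following is a minimal sketch of the pattern above, not a definitive implementation; the class name NonCapturingDemo and the helper firstMatch are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NonCapturingDemo {
    // The outer parentheses are capturing group 1.
    // (?:a|b) groups the alternation without creating a group of its own.
    static final Pattern AC_OR_BC = Pattern.compile("((?:a|b)c)");

    // Hypothetical helper: returns the first "ac" or "bc" in the input, or null if there is none.
    static String firstMatch(String input) {
        Matcher m = AC_OR_BC.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        Matcher m = AC_OR_BC.matcher("xbcx");
        if (m.find()) {
            System.out.println(m.group(1));     // bc
            System.out.println(m.groupCount()); // 1, because (?:a|b) is not counted
        }
    }
}
```

With (a|b)(c) instead, groupCount() would be 2 and the "a" or "b" would end up in a separate group from the "c".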
Other suggestions
If your implementation has it, then you can use non-capturing parentheses:
(?:a|b)
Rubular does not require parentheses around an alternation at all, but keep in mind that the precedence of | is low. For example, a|bc means "a" or "bc", not "ac" or "bc", so it does not match ccc.
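The precedence point can be checked directly in Java. This is only an illustrative sketch; the class AlternationPrecedenceDemo and the helper found are hypothetical names, and find() is used so that a match anywhere in the input counts.

```java
import java.util.regex.Pattern;

class AlternationPrecedenceDemo {
    // Hypothetical helper: true if the regex matches anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Without grouping, | splits the whole pattern: "a" OR "bc".
        System.out.println(found("a|bc", "ccc"));    // false
        System.out.println(found("a|bc", "a"));      // true, a lone "a" is enough

        // With a non-capturing group, the alternation covers only the first character.
        System.out.println(found("(?:a|b)c", "a"));  // false, the "c" is required
        System.out.println(found("(?:a|b)c", "ac")); // true
    }
}
```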
If your OR alternatives are all single characters, you can just use a character class:
([ab]c)
It matches only ac or bc, and it is more readable.
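As a rough sketch of the character class approach in Java: the whole match is always available through Matcher.group() (group 0), so the outer parentheses can be dropped if the entire match is all you need. The class CharClassDemo and the method allMatches are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharClassDemo {
    // [ab] matches exactly one character, either 'a' or 'b'; no group is needed.
    static final Pattern AC_OR_BC = Pattern.compile("[ab]c");

    // Hypothetical helper: collects every "ac" or "bc" found in the input, in order.
    static List<String> allMatches(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = AC_OR_BC.matcher(input);
        while (m.find()) {
            result.add(m.group()); // group() with no argument is the whole match
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(allMatches("ac bc cc abc")); // [ac, bc, bc]
    }
}
```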