Question

Take the following code:

$target = 'NAME FUNC LPAREN P COMMA P COMMA P RPAREN';
//$target = 'NAME FUNC LPAREN P RPAREN';
//$target = 'NAME FUNC LPAREN RPAREN';
$pattern = '/(?P<ruleName>NAME )?(?P<funcName>FUNC )?(?:(?<=LPAREN)(?: (?P<arg1>P))|(?P<args>P)(?=(?: RPAREN)|(?: COMMA)))/';

preg_match_all($pattern,$target,$matches,PREG_OFFSET_CAPTURE|PREG_PATTERN_ORDER);

I need to get the position of NAME, FUNC and each P within the $target (thus PREG_OFFSET_CAPTURE). The pattern works for Ps, but it doesn't match either of the named groups "ruleName" or "funcName".

What am I missing?

Thanks.


Solution

I think I've found the reason.

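  1. Your named groups are optional.
  2. If they match (and on the first attempt, at the start of the string, they do), the regex engine ends up standing just to the left of "LPAREN".
  3. Next it tries the alternation. The lookbehind (?<=LPAREN) is zero-width: it consumes nothing and only checks the text immediately to the left of the current position. That text is "FUNC ", not "LPAREN", so the first alternative fails.
  4. The second alternative needs a P at the current position, but there's an L, so it fails too.
  5. The regex engine backtracks, discards the optional matches from step 2, and moves forward through the string until it reaches the space right after "LPAREN", where the lookbehind finally succeeds.
  6. It matches there and keeps matching from then on, capturing all the Ps. But by then NAME and FUNC lie before the start of the match, so the named groups had to be given up for this to work.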
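The steps above can be checked with a short script. This is only a sketch: the pattern and target are copied from the question, while the variable names `$starts`, `$ruleNameText` and `$funcNameText` are made up for illustration.

```php
<?php
// Pattern and target copied verbatim from the question.
$target  = 'NAME FUNC LPAREN P COMMA P COMMA P RPAREN';
$pattern = '/(?P<ruleName>NAME )?(?P<funcName>FUNC )?(?:(?<=LPAREN)(?: (?P<arg1>P))|(?P<args>P)(?=(?: RPAREN)|(?: COMMA)))/';

preg_match_all($pattern, $target, $matches, PREG_OFFSET_CAPTURE | PREG_PATTERN_ORDER);

// Offset at which each overall match (group 0) starts.
$starts = array_column($matches[0], 1);

// Everything the two optional groups captured, joined across all matches.
// Groups that did not take part in a match are reported as empty strings.
$ruleNameText = implode('', array_column($matches['ruleName'], 0));
$funcNameText = implode('', array_column($matches['funcName'], 0));

echo 'match starts: ', implode(', ', $starts), "\n";
echo 'ruleName: "', $ruleNameText, '", funcName: "', $funcNameText, '"', "\n";
```

The first match should start at offset 16, the space right after "LPAREN", which is past both NAME and FUNC. That is why both named groups come back empty.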

I'm not sure why you need the lookbehind. How about consuming LPAREN as part of the match instead:

/(?P<ruleName>NAME )?(?P<funcName>FUNC )?(?:LPAREN )(?:(?P<arg1>P))|(?P<args>P)(?=(?: RPAREN)|(?: COMMA))/
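As a quick check of this suggestion, the sketch below runs the pattern against the first target from the question and collects every group that actually captured something, together with its offset. The `$found` array and the loop are illustrative additions, not part of the original code.

```php
<?php
$target  = 'NAME FUNC LPAREN P COMMA P COMMA P RPAREN';
$pattern = '/(?P<ruleName>NAME )?(?P<funcName>FUNC )?(?:LPAREN )(?:(?P<arg1>P))|(?P<args>P)(?=(?: RPAREN)|(?: COMMA))/';

preg_match_all($pattern, $target, $matches, PREG_OFFSET_CAPTURE | PREG_PATTERN_ORDER);

// Collect [group, text, offset] for every group that captured something.
$found = [];
foreach (['ruleName', 'funcName', 'arg1', 'args'] as $group) {
    foreach ($matches[$group] as [$text, $offset]) {
        if ($text !== '') { // groups that did not participate come back as ''
            $found[] = [$group, $text, $offset];
        }
    }
}

foreach ($found as [$group, $text, $offset]) {
    printf("%s => '%s' at offset %d\n", $group, $text, $offset);
}
```

With this target, ruleName is found at offset 0, funcName at 5, arg1 at 17, and the remaining Ps at 25 and 33 in the args group. Two caveats: the captured names include the trailing space ("NAME ", "FUNC "), and for the commented-out target with no arguments ('NAME FUNC LPAREN RPAREN') this pattern finds no match at all, because the first alternative requires a P right after "LPAREN ".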
Licensed under: CC-BY-SA with attribution