I think a little history will make it easier to understand. When Larry Wall wanted to grow regex syntax to support new features, his options were severely limited. He couldn't just decree (for example) that % is now a metacharacter that supports new feature "XYZ". That would break the millions of existing regexes that happened to use % to match a literal percent sign.
What he could do is take an already-defined metacharacter and use it in such a way that its original function wouldn't make sense. For example, any regex that contained two quantifiers in a row would be invalid, so it was safe to say a ? after another quantifier now turns it into a reluctant quantifier (a much better name than "lazy" IMO; "non-greedy" is good too). So the answer to your question is that ? doesn't modify the *; *? is a single entity: a reluctant quantifier. The same is true of the + in possessive quantifiers (*+, {0,2}+, etc.).
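To make the "single entity" point concrete, here is a small sketch. It uses Python's re module rather than Perl (my choice, purely for illustration; the sample string is made up), but the quantifier syntax shown here was borrowed from Perl, so the behaviour should carry over.

```python
import re

text = "<a><b>"

# Greedy: * grabs as much as it can, then backtracks just enough to match.
greedy = re.findall(r"<.*>", text)

# Reluctant: *? is one quantifier token that takes as little as it can.
reluctant = re.findall(r"<.*?>", text)

print(greedy)     # ['<a><b>']
print(reluctant)  # ['<a>', '<b>']

# Two ordinary quantifiers in a row are still rejected, which is the
# gap in the old syntax that made *? safe to introduce.
try:
    re.compile(r"a**")
    stacked_ok = True
except re.error:
    stacked_ok = False

print(stacked_ok)  # False
```

Note that the ? never acts as "zero or one" here; it only changes how the * consumes input.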
A similar process occurred with group syntax. It would never make sense to have a quantifier after an unescaped opening parenthesis, so it was safe to say (? now marks the beginning of a special group construct. But the question mark alone would only support one new feature, so the ? itself has to be followed by at least one more character to indicate which kind of group it is ((?:...), (?<!...), etc.). Again, the (?: is a single entity: the opening delimiter of a non-capturing group.
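The same idea can be sketched for groups. Again this is Python's re module, not Perl, and the sample strings are invented for the demo; it just shows that the characters after (? select the kind of group.

```python
import re

text = "2024-05"

# Plain parentheses capture: both groups show up in the result.
capturing = re.match(r"(\d+)-(\d+)", text)

# (?: opens a non-capturing group: it groups but stores nothing.
non_capturing = re.match(r"(?:\d+)-(\d+)", text)

print(capturing.groups())      # ('2024', '05')
print(non_capturing.groups())  # ('05',)

# (?<! opens a negative lookbehind: match "bar" only when it is
# not immediately preceded by "foo".
starts = [m.start() for m in re.finditer(r"(?<!foo)bar", "foobar bazbar")]
print(starts)  # [10]
```

In both patterns the ? right after the parenthesis is not quantifying anything; it is part of the group's opening delimiter.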
I don't know offhand why he used the question mark both times. I do know Perl 6 Rules (a bottom-up rewrite of Perl 5 regexes) has done away with all that crap and uses an infinitely more sensible syntax.