Quantifiers like * are greedy by default, which means they match as much as possible while still letting the overall pattern match. E.g. in your sample a regex like \[.*\] would match everything from the first [ to the last ] in the string. To change the default behaviour and make quantifiers lazy (ungreedy, reluctant):
- Use the U (PCRE_UNGREEDY) modifier to make all quantifiers lazy.
- Put a ? after a specific quantifier, e.g. .*? matches as few of any characters as possible.
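The difference between greedy and lazy matching can be seen in a small sketch. It uses Python's re module rather than PCRE (an assumption made only to get a runnable example; Python has no equivalent of the U modifier, but the ? suffix behaves the same way). The input string is hypothetical, since the original sample is not reproduced here.

```python
import re

# Hypothetical input, not the sample from the question.
text = "[first] and [second]"

# Greedy: .* takes as much as possible, so the match runs
# from the first [ to the last ].
greedy = re.findall(r"\[.*\]", text)

# Lazy: .*? takes as little as possible, so each bracket pair
# is matched on its own.
lazy = re.findall(r"\[.*?\]", text)

print(greedy)  # ['[first] and [second]']
print(lazy)    # ['[first]', '[second]']
```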
1.) Using the U modifier, a pattern could look like:
/\[\[(.*)]\s*\[(.*)]]/Us
The s (PCRE_DOTALL) modifier is added so that the . dot also matches newlines. \s* is added between ] and [ to allow for the whitespace that appears there in your sample string. \s is a shorthand for [ \t\r\n\f].
There are two capturing groups (.*) whose contents are then used in the replacement. Test on regex101.com
2.) Instead, using ? to make each quantifier lazy:
/\[\[(.*?)]\s*\[(.*?)]]/s
3.) Alternative without modifiers, if no square brackets are expected inside [...]:
/\[\[([^]]*)]\s*\[([^]]*)]]/
This uses a negated character class (a ^ at the start of the class): [^]]* allows any number of characters that are NOT ] in between [ and ]. It does not need to rely on greediness or laziness at all. Also, no . is used, so the s modifier is not needed.
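As a rough check of the third variant, the sketch below again uses Python's re module as a stand-in for PCRE (an assumption; the character-class syntax used here is the same in both). The input string is hypothetical and contains a line break inside the second bracket pair, to show that the negated class crosses newlines without any flag.

```python
import re

# Hypothetical input with a newline inside the second [...] pair.
text = "[[http://example.com] [Example\nsite]]"

# Pattern of variant 3, with no flags: [^]]* matches any
# characters except ], including newlines.
pattern = re.compile(r"\[\[([^]]*)]\s*\[([^]]*)]]")

match = pattern.search(text)
groups = match.groups()
print(groups)  # ('http://example.com', 'Example\nsite')
```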
Replacement for all 3 examples according to your sample: <a href="\1">\2</a> where \1 corresponds to the match of the first parenthesized group, ...
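Put together, the replacement could be sketched as follows. This is an illustration in Python's re module, not the PHP/PCRE call the patterns above are written for: re.DOTALL stands in for the s modifier, and since Python has no counterpart to the U modifier, only variants 2 and 3 are shown. The input string is a made-up example.

```python
import re

# Hypothetical input; the second link has a line break between ] and [.
text = "See [[http://example.com] [Example]] and [[http://example.org]\n[Other]]."

replacement = r'<a href="\1">\2</a>'

# Variant 2: lazy quantifiers; re.DOTALL plays the role of the s modifier.
lazy_result = re.sub(r"\[\[(.*?)]\s*\[(.*?)]]", replacement, text, flags=re.DOTALL)

# Variant 3: negated character class, no flags required.
negated_result = re.sub(r"\[\[([^]]*)]\s*\[([^]]*)]]", replacement, text)

print(lazy_result)
# See <a href="http://example.com">Example</a> and <a href="http://example.org">Other</a>.
print(lazy_result == negated_result)  # True
```

Both variants give the same result on this input. They would differ if a ] appeared inside one of the bracketed parts, which is the case variant 3 explicitly rules out.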