Question

Suppose I have a string that looks like:

"lets refer to [[merp] [that entry called merp]] and maybe also to that entry called [[blue] [blue]]"

The idea here is to replace a block of [[name][some text]] with <a href="name.html">some text</a>.

So I'm trying to use regular expressions to find blocks that look like [[name][some text]], but I'm having tremendous difficulty.

Here's what I thought should work (in PHP): preg_match_all('/\[\[.*\]\[.*\]/', $my_big_string, $matches)

But this just returns a single match, the string from '[[merp' to 'blue]]'. How can I get it to return the two matches [[merp][that entry called merp]] and [[blue][blue]]?


Solution

Quantifiers like * are greedy by default, which means they match as much as possible while still letting the overall pattern succeed. In your sample, for example, a regex like \[.*\] would match everything from the first [ to the last ] in the string. There are two ways to change the default behaviour and make quantifiers lazy (ungreedy, reluctant):

  • Use the U (PCRE_UNGREEDY) modifier to make all quantifiers lazy.
  • Put a ? after a specific quantifier, e.g. .*? matches as few characters as possible.
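
To make the difference concrete, here is a minimal sketch that runs the simple \[.*\] pattern on the sample string from the question in its greedy, lazy and U-modifier forms. The variable names are only illustrative.

```php
<?php
// Sample string from the question.
$text = 'lets refer to [[merp] [that entry called merp]] and maybe also to that entry called [[blue] [blue]]';

// Greedy: .* takes as much as it can, so a single match runs
// from the first [ to the last ] in the string.
preg_match_all('/\[.*\]/', $text, $greedy);

// Lazy: .*? stops at the first ] that lets the pattern succeed.
preg_match_all('/\[.*?\]/', $text, $lazy);

// U modifier: every quantifier is lazy by default, so plain .* behaves like .*?
preg_match_all('/\[.*\]/U', $text, $ungreedy);

echo count($greedy[0]), "\n"; // 1
print_r($lazy[0]);            // [[merp], [that entry called merp], [[blue], [blue]
```

Note that the lazy form alone is not the full answer: it splits the text at every ], which is why the patterns below spell out both bracket pairs.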

1.) Using the U modifier, a pattern could look like:

/\[\[(.*)]\s*\[(.*)]]/Us

I also used the s (PCRE_DOTALL) modifier so that the dot . matches newlines too, and added \s* between ] and [ to allow for the whitespace in your sample string. \s is a shorthand for whitespace characters such as [ \t\r\n\f].

The two capturing groups (.*) provide the parts to use in the replacement. Test on regex101.com


2.) Instead, using ? to make each quantifier lazy:

/\[\[(.*?)]\s*\[(.*?)]]/s

Test on regex101.com
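
As a rough check, the sketch below runs patterns 1.) and 2.) through preg_match_all on the sample string from the question. PREG_SET_ORDER is used only to make the output easier to loop over; the variable names are illustrative.

```php
<?php
// Sample string from the question.
$text = 'lets refer to [[merp] [that entry called merp]] and maybe also to that entry called [[blue] [blue]]';

// 1.) U modifier makes every quantifier lazy.
preg_match_all('/\[\[(.*)]\s*\[(.*)]]/Us', $text, $withU, PREG_SET_ORDER);

// 2.) Explicit ? after each .* quantifier.
preg_match_all('/\[\[(.*?)]\s*\[(.*?)]]/s', $text, $withLazy, PREG_SET_ORDER);

foreach ($withLazy as $m) {
    // $m[0] is the whole block, $m[1] the name, $m[2] the link text.
    echo $m[1], ' => ', $m[2], "\n";
}
// merp => that entry called merp
// blue => blue
```

Both patterns return the same two matches here, so the choice between them is mostly a matter of taste.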


3.) An alternative without modifiers, if no square brackets are expected inside [...]:

/\[\[([^]]*)]\s*\[([^]]*)]]/

This uses a negated character class: [^]]* allows any number of characters that are NOT ] between [ and ]. It does not rely on greediness at all, and since no . is used, the s modifier is not needed.

Test on regex101.com
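
The point about the s modifier can be seen with a small sketch. The input below is a made-up variation of the sample in which the link text wraps onto a second line; it is not taken from the question.

```php
<?php
// Hypothetical input: the link text contains a newline.
$text = "see [[merp] [that entry\ncalled merp]] for details";

// Lazy dot WITHOUT the s modifier: . cannot cross the newline, so nothing matches.
$dotResult = preg_match('/\[\[(.*?)]\s*\[(.*?)]]/', $text, $dotMatch);

// Negated class: [^]] matches any character except ], newlines included.
$negResult = preg_match('/\[\[([^]]*)]\s*\[([^]]*)]]/', $text, $negMatch);

var_dump($dotResult); // int(0)
var_dump($negResult); // int(1)
```

The trade-off is the one stated above: the negated class cannot match a block whose name or text itself contains a ].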


Replacement for all 3 examples, according to your sample: <a href="\1">\2</a>, where \1 corresponds to the match of the first parenthesized group, and so on.
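
Put together with preg_replace, this could look like the sketch below. It uses pattern 3.), but any of the three should behave the same on this input. The second call, which appends .html to the name, is an assumption based on the href="name.html" format mentioned in the question.

```php
<?php
// Sample string from the question.
$text = 'lets refer to [[merp] [that entry called merp]] and maybe also to that entry called [[blue] [blue]]';

$pattern = '/\[\[([^]]*)]\s*\[([^]]*)]]/';

// \1 is the name, \2 the link text (single quotes keep the backslashes intact).
$html = preg_replace($pattern, '<a href="\1">\2</a>', $text);

// Variant with the .html suffix the question asked for (assumed requirement).
$htmlWithSuffix = preg_replace($pattern, '<a href="\1.html">\2</a>', $text);

echo $html, "\n";
// lets refer to <a href="merp">that entry called merp</a> and maybe also to that entry called <a href="blue">blue</a>
```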

Other tips

The regex you're looking for is \[\[(.+?)\]\s\[(.+?)\]\]; replace it with <a href="$1">$2</a>.

The text matched by each pattern inside () parentheses is captured and can be back-referenced using $1, $2, ...

Example on regex101.com
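
One thing worth checking with this pattern: \s without a quantifier requires exactly one whitespace character between ] and [. The sketch below compares it with a \s* variant on two short inputs modelled on the question; the variable names and the \s* variant are illustrative, not part of this answer.

```php
<?php
$strict   = '/\[\[(.+?)\]\s\[(.+?)\]\]/';  // pattern from this answer
$flexible = '/\[\[(.+?)\]\s*\[(.+?)\]\]/'; // same pattern with \s* (assumed variant)

$spaced = 'see [[blue] [blue]]';      // one space between ] and [
$tight  = 'see [[name][some text]]';  // no space between ] and [

$spacedOut   = preg_replace($strict, '<a href="$1">$2</a>', $spaced);
$tightOut    = preg_replace($strict, '<a href="$1">$2</a>', $tight);
$flexibleOut = preg_replace($flexible, '<a href="$1">$2</a>', $tight);

echo $spacedOut, "\n";   // see <a href="blue">blue</a>
echo $tightOut, "\n";    // see [[name][some text]]  (left unchanged)
echo $flexibleOut, "\n"; // see <a href="name">some text</a>
```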

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow