Question

I have this text and I'm trying to remove all the inner quotes, keeping only one quoting level. The text inside a quote can contain any character, including line feeds. Is this possible with a regex, or do I have to write a small parser?

[quote=foo]I really like the movie. [quote=bar]World 

War Z[/quote] It's amazing![/quote]
This is my comment.
[quote]Hello, World[/quote]
This is another comment.
[quote]Bye Bye Baby[/quote]

Here is the text I want:

[quote=foo]I really like the movie.  It's amazing![/quote]
This is my comment.
[quote]Hello, World[/quote]
This is another comment.
[quote]Bye Bye Baby[/quote]

This is the regex I'm using in PHP:

%\[quote\s*(=[a-zA-Z0-9\-_]*)?\](.*)\[/quote\]%si

I also tried this variant, but it doesn't match "." or ",", and I can't predict which other characters might appear inside a quote:

%\[quote\s*(=[a-zA-Z0-9\-_]*)?\]([\w\s]+)\[/quote\]%i

The problem is located here:

(.*)

Solution

You can use this:

$result = preg_replace('~\G(?!\A)(?>(\[quote\b[^]]*](?>[^[]+|\[(?!/?quote)|(?1))*\[/quote])|(?<!\[)(?>[^[]+|\[(?!/?quote))+\K)|\[quote\b[^]]*]\K~', '', $text);

Details:

\G(?!\A)              # contiguous to a previous match
(?>                   ## content inside "quote" tags at level 0
  (                    ## nested "quote" tags (group 1)
    \[quote\b[^]]*]
    (?>                ## content inside "quote" tags at any level
      [^[]+
     |                  # OR
      \[(?!/?quote)
     |                  # OR
      (?1)              # repeat the capture group 1 (recursive)
    )*
    \[/quote]
  )
 |
  (?<!\[)           # not preceded by an opening square bracket
  (?>              ## content that is not a quote tag
    [^[]+           # all that is not a [
   |                # OR
    \[(?!/?quote)   # a [ not followed by "quote" or "/quote"
  )+\K              # repeat 1 or more and reset the match
)
|                   # OR
\[quote\b[^]]*]\K   # "quote" tag at level 0 
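
As a usage sketch, the pattern above can be applied to the sample text from the question. The variable names `$pattern` and `$text` and the null check are assumptions added here for illustration; only the pattern itself comes from the answer.

```php
<?php
// Pattern copied verbatim from the answer above.
$pattern = '~\G(?!\A)(?>(\[quote\b[^]]*](?>[^[]+|\[(?!/?quote)|(?1))*\[/quote])|(?<!\[)(?>[^[]+|\[(?!/?quote))+\K)|\[quote\b[^]]*]\K~';

// Sample input taken from the question.
$text = "[quote=foo]I really like the movie. [quote=bar]World \n\nWar Z[/quote] It's amazing![/quote]\n"
      . "This is my comment.\n"
      . "[quote]Hello, World[/quote]\n";

$result = preg_replace($pattern, '', $text);

// preg_replace returns null on a PCRE error (for example when a
// backtracking or recursion limit is hit), so check before using it.
if ($result === null) {
    $result = $text; // fall back to the unmodified input
}

echo $result;
```

The pattern is case-sensitive as written, so tags such as [QUOTE] would presumably need the i modifier.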

Other tips

Use this pattern:

\[quote=?[^\]]*\][^\[]*\[/quote\](?=((.(?!\[q))*)\[/)

and replace each match with an empty string.

I think it would be easier to write a parser.

Use a regex to find the [quote] and [/quote] tags, and then analyse the result.

preg_match_all('#(\[quote[^]]*\]|\[/quote\])#', $bbcode, $matches, PREG_OFFSET_CAPTURE);
$nestlevel = 0;
$cutfrom = 0;
$cut = false;
$removed = 0; // number of bytes already removed from $bbcode
foreach ($matches[0] as $quote) {
    // $quote[0] is the tag text, $quote[1] its offset in the original string.
    // A closing tag starts with "[/"; anything else is an opening tag.
    if (substr($quote[0], 0, 2) == '[/') $nestlevel--; else $nestlevel++;
    if (!$cut && $nestlevel == 2) { // we reached the first nested quote, start removing here
        $cut = true;
        $cutfrom = $quote[1];
    }
    if ($cut && $nestlevel == 1) { // we closed the nested quote, stop removing here
        $cut = false;
        $length = $quote[1] + 8 - $cutfrom; // strlen('[/quote]') = 8
        $bbcode = substr_replace($bbcode, '', $cutfrom - $removed, $length);
        $removed += $length;
    }
}
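
The same depth-counting idea can also be sketched by splitting the text into tag and text tokens and rebuilding the output. This is only an illustrative variant, not the method from the answer: the function name `strip_nested_quotes` is hypothetical, and it assumes that stray closing tags outside any quote should be left untouched.

```php
<?php
// Hypothetical helper: keep only the outermost [quote] level by
// tracking the nesting depth token by token.
function strip_nested_quotes(string $bbcode): string
{
    // Split on quote tags, keeping the tags themselves as tokens.
    $tokens = preg_split(
        '#(\[quote\b[^]]*\]|\[/quote\])#i',
        $bbcode,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );

    $depth = 0;
    $out = '';
    foreach ($tokens as $token) {
        if (strcasecmp($token, '[/quote]') === 0) {
            // Keep the tag if it closes a top-level quote, or if it is a
            // stray closer outside any quote (assumption: leave it as-is).
            if ($depth <= 1) {
                $out .= $token;
            }
            if ($depth > 0) {
                $depth--;
            }
        } elseif (preg_match('#^\[quote\b[^]]*\]$#i', $token)) {
            $depth++;
            // Keep an opening tag only when it opens a top-level quote.
            if ($depth === 1) {
                $out .= $token;
            }
        } elseif ($depth <= 1) {
            // Plain text: keep it outside quotes and at the first level.
            $out .= $token;
        }
    }
    return $out;
}

$sample = "[quote=foo]I really like the movie. [quote=bar]World \n\nWar Z[/quote] It's amazing![/quote]\n"
        . "This is my comment.\n"
        . "[quote]Hello, World[/quote]";

echo strip_nested_quotes($sample);
```

Building a new string avoids the offset bookkeeping that in-place removal needs, at the cost of copying the text once.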
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow