Question

This is a tricky one. I have a string:

This is some text with a {%TAG IN IT%} and some more text then {%ANOTHER TAG%} with some more text at the end.

I have a regex to match the tags:

({%\w+[\w =!:;,\.\$%"'#\?\-\+\{}]*%})

This matches an opening tag that starts with one or more word characters, followed by any number of other ASCII characters (a sample set is specified in the character class above).

However (in PHP using "preg_match_all" and "preg_split" at least) the fact that the set contains both the percent (%) and the curly braces ({}) means that the regex matches too much if there are two tags on the same line.

E.g., in the example given, the following is matched:

{%TAG IN IT%} and some more text then {%ANOTHER TAG%}

As you can see, the %}...{% were matched. So, what I need is to allow the "%" but NOT when followed by "}"
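
A minimal script that reproduces the over-match, assuming the string and pattern exactly as given above (the variable names are only illustrative):

```php
<?php
// The sample string from the question.
$input = 'This is some text with a {%TAG IN IT%} and some more text then {%ANOTHER TAG%} with some more text at the end.';

// The original pattern, written as a single-quoted PHP string,
// so only the ' inside the character class needs escaping for PHP.
$greedy = '/({%\w+[\w =!:;,\.\$%"\'#\?\-\+\{}]*%})/';

preg_match_all($greedy, $input, $matches);

// The greedy * consumes as much as it can and then backtracks to the
// LAST "%}", so both tags end up inside a single match.
print_r($matches[0]);
```

With this input, `$matches[0]` holds one element spanning both tags rather than two separate tags.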

I've tried non-greedy matching and negative lookahead, but the negative lookahead won't work inside a character class (i.e. everything in the [\w...]* set).

I'm stuck!

Solution 2

A slight modification of your regexp works (just add the question mark to make it non-greedy):

<?php
    $input = "This is some text with a {%TAG % }IT%%} and some more text then {%ANOTHER TAG%} with some more text at the end.";
    $regexp = "/{%\w+[\w =!:;,\.\$%\"'#\?\-\+\{}]*?%}/";
    //                                            ^ Notice this
    if(preg_match_all($regexp, $input, $matches, PREG_SET_ORDER)) {
        foreach($matches as $match) {
            var_dump($match);
            echo "\r\n";
        }
        unset($match);
    }
    /*
        Outputs:
        array
          0 => string '{%TAG % }IT%%}' (length=14)
        array
          0 => string '{%ANOTHER TAG%}' (length=15)
    */
?>

OTHER TIPS

You could use alternation to achieve this:

/\{%(?:[^%]|%(?!}))*%\}/

Inside the tag, it matches either any character other than %, or a % that is not followed by } (using a negative look-ahead assertion). The match therefore ends at the first %} it finds.

$str = 'This is some text with a {%tag with % and } inside%} and some more text then {%ANOTHER TAG%} with some more text at the end.';

$pattern = '/\{%(?:[^%]|%(?!}))*%\}/';

preg_match_all($pattern, $str, $matches);
print_r($matches[0]);

Output:

Array
(
    [0] => {%tag with % and } inside%}
    [1] => {%ANOTHER TAG%}
)
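
Since the question also mentions preg_split, here is a rough sketch of the same pattern used for splitting. The helper name splitOnTags is hypothetical, and the outer capture group and the two PREG_SPLIT_* flags are additions for this example, not part of the answer above:

```php
<?php
// Hypothetical helper: split a template string into literal text and tags.
function splitOnTags(string $text): array
{
    // Same alternation pattern as above, wrapped in one capture group so
    // that PREG_SPLIT_DELIM_CAPTURE keeps the tags in the result.
    $pattern = '/(\{%(?:[^%]|%(?!}))*%\})/';

    // PREG_SPLIT_NO_EMPTY drops empty pieces, e.g. when a tag sits at the
    // very start or end of the string.
    return preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
}

$str = 'This is some text with a {%tag with % and } inside%} and some more text then {%ANOTHER TAG%} with some more text at the end.';
$parts = splitOnTags($str);
print_r($parts);
```

For this input, the result should alternate between plain text and tags: text, tag, text, tag, text.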
Licensed under: CC-BY-SA with attribution