Question

Well, I don't think the title is right, but here is the problem.

I am working on a tabs shortcode and have this regex, which works fine:

http://regex101.com/r/fR0iV7

    [tabs id="myid" type="tabnav"]
    [tabsgroup title="Tab title" active="1"]Tab content goes here...[/tabsgroup]
    [tabsgroup title="Tab title" active="0"]Tab content goes here...[/tabsgroup]
    [tabsgroup title="Tab title" active="0"]Tab content goes here...[/tabsgroup]
    [/tabs]

$re = '/\[tabsgroup .*?title\=\"(.*?)\".*?active\=\"(.*?)\"](.*?)\[\/tabsgroup\]/s';


// prints

MATCH 1
1.  [49-58] `Tab title`
2.  [68-69] `1`
3.  [71-95] `Tab content goes here...`
MATCH 2
1.  [126-135]   `Tab title`
2.  [145-146]   `0`
3.  [148-172]   `Tab content goes here...`
MATCH 3
1.  [203-212]   `Tab title`
2.  [222-223]   `0`
3.  [225-249]   `Tab content goes here...`

until I try to use an icon inside the title.

[tabsgroup title="Tab title  <span class="fa fa-star"></span>" active="1"]Tab content goes here...[/tabsgroup]

The match then comes out only partially (http://regex101.com/r/xM8cW8):

MATCH 1
1.  [49-72] `Tab title  <span class=`
2.  [102-103]   `1`
3.  [105-129]   `Tab content goes here...`

I know this happens because I am capturing the title with title\=\"(.*?)\", but I am not sure how to refine it so that it can distinguish between the attribute's own quotes and the quotes nested inside the value. Any help is appreciated.


Solution

You can quickly fix the problem by skipping all parts of attribute values that are between angle brackets:

$pattern = '~
    \[tabsgroup .*? 
        title="( (?> <[^>]+> | [^"] )* )" .*?
        active="(.*?)"
     ]
     (.*?)
    \[/tabsgroup] ~xs';
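
As a quick check, here is a minimal sketch that runs the pattern above against the shortcode from the question. The `$input` string and the output format are assumptions made for illustration; only the pattern itself comes from this answer.

```php
<?php
// Pattern from the answer: inside the title, any <...> chunk is consumed
// as a whole, so the quotes it contains do not end the attribute value.
$pattern = '~
    \[tabsgroup .*?
        title="( (?> <[^>]+> | [^"] )* )" .*?
        active="(.*?)"
    ]
    (.*?)
    \[/tabsgroup] ~xs';

// Assumed sample input, modelled on the shortcode in the question.
$input = '[tabsgroup title="Tab title <span class="fa fa-star"></span>" active="1"]'
       . 'Tab content goes here...[/tabsgroup]';

// PREG_SET_ORDER groups the captures per match: $m[1], $m[2], $m[3].
preg_match_all($pattern, $input, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    // $m[1] = title, $m[2] = active flag, $m[3] = tab content
    printf("title=%s | active=%s | content=%s\n", $m[1], $m[2], $m[3]);
}
```

With this input the title capture should contain the whole `<span>` element instead of stopping at the first inner quote.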

However, keep in mind that using .* or .*? is dangerous and can give you unexpected results, since these subpatterns can match absolutely anything. I don't know all the possible attributes of your shortcodes, but I suggest replacing .*? with something more explicit. Example:

$pattern = <<<'EOD'
~
    \[tabsgroup
        \s+ # if you know that it can only be whitespace characters
            # otherwise use \s[^]]*?
        title="( (?> <[^>]+> | [^"] )* )"
        \s+ # the same here
        active="([^"]*)"  # no quotes allowed here
        # or you can reuse the title subpattern: active="((?1))"
        # however, if only possible values are 0 and 1: active="([01])"
     ]
     (.*?) # This can be changed to ([^[]+) if you are sure there are no
           # other tags inside.
           # If that is not the case: ( (?>[^[]+|\[(?!/tabsgroup]))* )
           # will match everything except [/tabsgroup]
    \[/tabsgroup]
~xs
EOD;
// Once the pattern no longer contains a dot, don't forget to remove the s modifier
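
To illustrate why the explicit version is safer, here is a small sketch comparing a loose pattern with a strict one on a malformed shortcode. The input (a first tab with no `active` attribute) is a hypothetical case, and the strict pattern is assembled from the alternatives suggested in the comments above (`[01]` for `active` and the dot-free content subpattern), so treat it as one possible variant rather than a definitive one.

```php
<?php
// Loose version: the lazy dots are free to run past the closing tag.
$loose = '~
    \[tabsgroup .*?
        title="( (?> <[^>]+> | [^"] )* )" .*?
        active="(.*?)"
    ]
    (.*?)
    \[/tabsgroup] ~xs';

// Strict version: only whitespace between attributes, no dot anywhere,
// so the s modifier is dropped.
$strict = '~
    \[tabsgroup
        \s+ title="( (?> <[^>]+> | [^"] )* )"
        \s+ active="([01])"
    ]
    ( (?> [^[]+ | \[ (?!/tabsgroup]) )* )
    \[/tabsgroup] ~x';

// Hypothetical malformed input: the first tab is missing active="...".
$input = '[tabsgroup title="First"]One[/tabsgroup]' . "\n"
       . '[tabsgroup title="Second" active="1"]Two[/tabsgroup]';

preg_match_all($loose, $input, $looseMatches, PREG_SET_ORDER);
preg_match_all($strict, $input, $strictMatches, PREG_SET_ORDER);

// The loose pattern pairs the first title with the second tab's content;
// the strict pattern skips the malformed tab and matches only the second.
printf("loose:  title=%s content=%s\n", $looseMatches[0][1], $looseMatches[0][3]);
printf("strict: title=%s content=%s\n", $strictMatches[0][1], $strictMatches[0][3]);
```

The loose pattern silently merges two tabs into one match, which is the kind of unexpected result that `.*?` can produce. The strict pattern simply ignores the tab that does not have the expected shape.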
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow