Question

Well, I don't think the title is right, but here is the problem.

I am working on a tabs shortcode and have this regex which works fine

http://regex101.com/r/fR0iV7

    [tabs id="myid" type="tabnav"]
    [tabsgroup title="Tab title" active="1"]Tab content goes here...[/tabsgroup]
    [tabsgroup title="Tab title" active="0"]Tab content goes here...[/tabsgroup]
    [tabsgroup title="Tab title" active="0"]Tab content goes here...[/tabsgroup]
    [/tabs]

$re = '/\[tabsgroup .*?title\=\"(.*?)\".*?active\=\"(.*?)\"](.*?)\[\/tabsgroup\]/s';


// print

MATCH 1
1.  [49-58] `Tab title`
2.  [68-69] `1`
3.  [71-95] `Tab content goes here...`
MATCH 2
1.  [126-135]   `Tab title`
2.  [145-146]   `0`
3.  [148-172]   `Tab content goes here...`
MATCH 3
1.  [203-212]   `Tab title`
2.  [222-223]   `0`
3.  [225-249]   `Tab content goes here...`

until I try to use an icon inside the title.

[tabsgroup title="Tab title  <span class="fa fa-star"></span>" active="1"]Tab content goes here...[/tabsgroup]

the match comes out partially as http://regex101.com/r/xM8cW8

MATCH 1
1.  [49-72] `Tab title  <span class=`
2.  [102-103]   `1`
3.  [105-129]   `Tab content goes here...`

I know this happens because I am capturing the title with title\=\"(.*?)\", which stops at the first quote it finds, but I am not sure how to refine it so that it can distinguish between the attribute's own quotes and the quotes nested inside the value. Any help is appreciated.


Solution

You can quickly fix the problem by consuming anything between angle brackets inside the title value as a single unit, so that quotes inside an HTML tag do not end the attribute:

$pattern = '~
    \[tabsgroup .*? 
        title="( (?> <[^>]+> | [^"] )* )" .*?
        active="(.*?)"
     ]
     (.*?)
    \[/tabsgroup] ~xs';
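
As a rough check, here is a minimal sketch that runs the pattern above against the shortcode from the question. It assumes `preg_match_all` with `PREG_SET_ORDER`; the variable names `$input` and `$matches` are only illustrative.

```php
<?php
// Pattern from above: inside title="...", consume either a whole <...> tag
// (its quotes included) or any single character that is not a double quote.
$pattern = '~
    \[tabsgroup .*?
        title="( (?> <[^>]+> | [^"] )* )" .*?
        active="(.*?)"
     ]
     (.*?)
    \[/tabsgroup] ~xs';

// The shortcode from the question, with an icon inside the title.
$input = '[tabsgroup title="Tab title  <span class="fa fa-star"></span>" active="1"]'
       . 'Tab content goes here...[/tabsgroup]';

// PREG_SET_ORDER groups the results per match: $matches[0] is the first
// shortcode, with [1] = title, [2] = active, [3] = content.
preg_match_all($pattern, $input, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    echo "title:   {$m[1]}\n";
    echo "active:  {$m[2]}\n";
    echo "content: {$m[3]}\n";
}
```

With this input the title group should hold the whole value, span tag included, instead of stopping at `class=`.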

However, keep in mind that using .* or .*? is dangerous and can give unexpected results, since these subpatterns can match anything at all (including newlines when the s modifier is set). I don't know all the possible attributes of your shortcodes, but I suggest replacing .*? with something more explicit. Example:

$pattern = <<<'EOD'
~
    \[tabsgroup
        \s+ # if you know that it can only be whitespace characters
            # otherwise use \s[^]]*?
        title="( (?> <[^>]+> | [^"] )* )"
        \s+ # the same here
        active="([^"]*)"  # no quotes allowed here
        # or you can reuse the title subpattern: active="((?1))"
        # however, if the only possible values are 0 and 1: active="([01])"
     ]
     (.*?) # This can be changed to ([^[]+) if you are sure there are no
           # other tags inside.
           # If it is not the case: ( (?>[^[]+|\[(?!/tabsgroup]))* )
           # will match all except [/tabsgroup]
    \[/tabsgroup]
~xs
EOD;
// Once the pattern no longer contains a dot, don't forget to remove the s modifier.
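
To illustrate why the explicit version is safer, here is a small sketch comparing the two patterns. The input is hypothetical: its first shortcode is missing the `active` attribute. The strict pattern assumes `active` can only be 0 or 1 and uses the dot-free content branch from the comments above, so the s modifier is dropped. The variable names are illustrative only.

```php
<?php
// Loose pattern (first version, written on one line without the x modifier).
$loose = '~\[tabsgroup .*?title="((?><[^>]+>|[^"])*)".*?active="(.*?)"](.*?)\[/tabsgroup]~s';

// Explicit pattern: only whitespace between attributes, active limited to
// 0 or 1 (an assumption), and content that stops at the closing tag.
$strict = <<<'EOD'
~
    \[tabsgroup
        \s+
        title="( (?> <[^>]+> | [^"] )* )"
        \s+
        active="([01])"
     ]
     ( (?> [^[]+ | \[ (?!/tabsgroup]) )* )
    \[/tabsgroup]
~x
EOD;

// Hypothetical input: the first shortcode has no active attribute.
$input = '[tabsgroup title="A"]first[/tabsgroup]'
       . '[tabsgroup title="B" active="1"]second[/tabsgroup]';

preg_match_all($loose, $input, $looseMatches, PREG_SET_ORDER);
preg_match_all($strict, $input, $strictMatches, PREG_SET_ORDER);

// The loose pattern lets ".*?" run past the first closing tag and pairs
// title "A" with the attributes and content of the second shortcode.
echo "loose:  {$looseMatches[0][1]} / {$looseMatches[0][2]} / {$looseMatches[0][3]}\n";

// The strict pattern skips the malformed shortcode and matches only the second.
echo "strict: {$strictMatches[0][1]} / {$strictMatches[0][2]} / {$strictMatches[0][3]}\n";
```

The loose pattern reports a single match that silently mixes two shortcodes, while the strict one reports only the well-formed shortcode.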
Licensed under: CC-BY-SA with attribution