Question

I have two sentences as input. Let's say for example:

<span>I love my red car.</span>
<span>I love my car.</span>

Now I want to match the text inside each pair of span tags and, if it is present, the color.

If I use the following regex:

/<span>(.*?)(?P<color>red)(.*?)<\/span>/ms

Only the line with the color is matched, so I tried the ? quantifier (zero or one):

/<span>(.*?)(?P<color>red)?(.*?)<\/span>/ms

Now both sentences are matched, but the color is no longer captured.

The question is: why? By putting ".*?" before the color part, I thought I had made the regex non-greedy, so that the color group would match whenever the color is present. But it doesn't.


Solution

The first (.*?) starts by matching the empty string between > and I. Because it is lazy, the engine immediately tries the next part of the regex, (?P<color>red)?, but there is no red at that position, so the zero-occurrence option of ? applies and the engine moves on to the next part, the second (.*?). That group also starts out empty at the same position, and because it is lazy too, the engine then tries the next part of the regex: <\/span> (I'm taking it as a whole). That fails, so the second (.*?) grows one character at a time until <\/span> matches.

So the second (.*?) ends up matching everything up to the closing tag, including red.

As a result, your results[1] will be empty, as will results['color'] (the named key is a string, so it should be quoted), and results[3] will contain "I love my red car."
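The walk-through above can be reproduced with a small sketch. It uses Python rather than the asker's language, on the assumption that Python's re module handles (?P<color>...) and lazy quantifiers the same way for this pattern. The variable names are illustrative, and the slash is left unescaped because Python patterns have no delimiters.

```python
import re

text = "<span>I love my red car.</span>\n<span>I love my car.</span>"

# The question's second pattern: the colour group is optional.
pattern = re.compile(r"<span>(.*?)(?P<color>red)?(.*?)</span>", re.M | re.S)

# Collect (group 1, colour group, group 3) for every match.
results = [
    (m.group(1), m.group("color"), m.group(3))
    for m in pattern.finditer(text)
]

for row in results:
    print(row)
# ('', None, 'I love my red car.')
# ('', None, 'I love my car.')
```

Group 1 stays empty and the colour group never participates, so the third group absorbs the whole sentence. Python reports the unused group as None, whereas other engines may report an empty string.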

One workaround is to use alternation, as NickC did in his answer. Another is to use a negative lookahead that checks every character before it is consumed:

<span>((?:(?!\bred\b).)*(?<colour>\bred\b)?.*)<\/span>

regex101 demo

As a side note, I would advise using word boundaries so that you don't match things like reduce or jarred.
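To see the lookahead version and the word boundaries at work, here is a minimal sketch, again in Python as an assumption about tooling. Python's re module needs the (?P<color>...) spelling for named groups, so the group is named color here. The helper find_color is hypothetical and only exists for this illustration.

```python
import re

# Each character is consumed only if the word "red" does not start there,
# so the optional colour group gets a chance at the first standalone "red".
pattern = re.compile(r"<span>((?:(?!\bred\b).)*(?P<color>\bred\b)?.*)</span>")

def find_color(line):
    """Return the captured colour, or None if the span has no standalone 'red'."""
    m = pattern.search(line)
    return m.group("color") if m else None

print(find_color("<span>I love my red car.</span>"))        # red
print(find_color("<span>I love my car.</span>"))            # None
print(find_color("<span>I reduce jarred noise.</span>"))    # None
```

Because the trailing .* is greedy, this sketch is meant to be applied one line at a time. Several spans on the same line would be matched together.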

OTHER TIPS

This should work:

/<span>(.*?(?P<color>red).*?|.*?)<\/span>/ms

Your original expression was pretty good. I modified it slightly to make a new outer group match the whole sentence. I used that new outer group to create an "or" condition to match "anything", in case the color is not present.

Abbreviated output:

Array
    [0] => Array
            [0] => <span>I love my red car.</span>
            [1] => <span>I love my car.</span>

    [1] => Array
            [0] => I love my red car.
            [1] => I love my car.

    [color] => Array
            [0] => red
            [1] => 
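
The same alternation pattern can be tried outside PHP. This is a sketch in Python, assuming its re module is an acceptable stand-in. The names text, pattern and rows are illustrative.

```python
import re

text = "<span>I love my red car.</span>\n<span>I love my car.</span>"

# First alternative requires the colour; second accepts any content.
pattern = re.compile(r"<span>(.*?(?P<color>red).*?|.*?)</span>", re.M | re.S)

# Collect (sentence, colour) for every match.
rows = [(m.group(1), m.group("color")) for m in pattern.finditer(text)]

for row in rows:
    print(row)
# ('I love my red car.', 'red')
# ('I love my car.', None)
```

Python reports the missing colour as None where the PHP output above shows an empty entry. One caveat worth checking against your own data: with the s flag, .*? can run past a closing tag, so if a span without the colour comes before one with it, the first alternative may swallow both spans in a single match.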
Licensed under: CC-BY-SA with attribution