Question

I'm trying to match the following:

<h4><a href="#"></a>
        Bartal, Association            </h4>
        -->
        <p>Travis House                 
        <p><b>City</b> :
        <em>Austin</em></p>

Note that the "<p>Travis House" part is sometimes present and sometimes not.

I am using the following regex:

~<!--.+?<h4>(.+?)<\/h4>.+?(?:<p>(.+))?.+?<p><b>City<\/b>.+?<em>(.+?)<\/em>~is

It matches the rest fine, but it never captures the '<p>Travis House' part.

Can anyone assist?

Solution

This works:

~--\s+<h4>(.+?)<\/h4>.+?(?:<p>(.+?)\n)?\s+<p><b>City<\/b>.+?<em>(.+?)<\/em>~is
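
As a quick check, the pattern can be exercised outside PHP. The sketch below uses Python's re module instead of preg_match, which is an assumption made for illustration: the ~...~is delimiters and modifiers become re.S | re.I, and the \/ escapes are dropped because Python needs no delimiter. The sample text also assumes that the comment opens with <!-- on the line before <h4>, which the excerpt in the question does not show. The names LINES, SAMPLE_WITH, SAMPLE_WITHOUT, and extract are hypothetical.

```python
import re

# Translation of the PCRE pattern above.
# re.S lets "." match newlines (the "s" modifier); re.I ignores case (the "i" modifier).
PATTERN = re.compile(
    r"--\s+<h4>(.+?)</h4>.+?(?:<p>(.+?)\n)?\s+<p><b>City</b>.+?<em>(.+?)</em>",
    re.S | re.I,
)

# Assumed sample: the leading "<!--" line is a guess based on the regex in the question.
LINES = [
    "<!--",
    '<h4><a href="#"></a>',
    "        Bartal, Association            </h4>",
    "        -->",
    "        <p>Travis House                 ",
    "        <p><b>City</b> :",
    "        <em>Austin</em></p>",
]

SAMPLE_WITH = "\n".join(LINES)
# Same markup with the optional "<p>Travis House" line left out.
SAMPLE_WITHOUT = "\n".join(line for line in LINES if "Travis House" not in line)


def extract(html):
    """Return (heading, optional line, city) with whitespace trimmed, or None."""
    m = PATTERN.search(html)
    if m is None:
        return None
    # Group 2 is None when the optional <p> line is absent.
    return tuple(g.strip() if g is not None else None for g in m.groups())


if __name__ == "__main__":
    print(extract(SAMPLE_WITH))
    print(extract(SAMPLE_WITHOUT))
```

With these assumed samples, the second element of the result is 'Travis House' for the first input and None for the second, while the city is captured in both cases.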

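The optional group in the original pattern:

(?:<p>(.+))?

never captured anything because it sits between two lazy (ungreedy) .+? quantifiers that, with the s modifier, also match newlines. The .+? before it stops after a single character, where the next text is whitespace rather than <p>, so the optional group matches nothing. The .+? after it then expands across the '<p>Travis House' line until it reaches <p><b>City. Because that already produces a match, the engine never backtracks to try the group at a later position. In the working pattern the group is followed by \s+, which cannot skip over '<p>Travis House', so the engine is forced to extend the preceding .+? until the group itself matches that line.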
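
To see the failure concretely, the original pattern from the question can be run against the sample. This is a sketch in Python's re module rather than PHP (an assumption for illustration; re.S | re.I stand in for the "is" modifiers and the \/ escapes are dropped), and the sample assumes an opening <!-- line before <h4> that the excerpt does not show. The names ORIGINAL, SAMPLE, optional_capture, and city are hypothetical.

```python
import re

# Original pattern from the question, translated from PCRE (~...~is) to Python flags.
ORIGINAL = re.compile(
    r"<!--.+?<h4>(.+?)</h4>.+?(?:<p>(.+))?.+?<p><b>City</b>.+?<em>(.+?)</em>",
    re.S | re.I,
)

# Assumed sample: the leading "<!--" line is a guess based on the pattern itself.
SAMPLE = "\n".join([
    "<!--",
    '<h4><a href="#"></a>',
    "        Bartal, Association            </h4>",
    "        -->",
    "        <p>Travis House                 ",
    "        <p><b>City</b> :",
    "        <em>Austin</em></p>",
])

match = ORIGINAL.search(SAMPLE)

# The overall match succeeds, yet the optional group took the empty alternative,
# so its capture is unset (None in Python).
optional_capture = match.group(2)
city = match.group(3)

# The skipped line was consumed by the lazy ".+?" that follows the optional group,
# so it still appears inside the full match.
swallowed = "Travis House" in match.group(0)

if __name__ == "__main__":
    print(optional_capture, city, swallowed)
```

The point of the sketch is that a lazy quantifier only grows when the rest of the pattern fails; since an optional group can always succeed by matching nothing, nothing forces the first .+? to advance to the <p> tag.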

Licensed under: CC-BY-SA with attribution