Question

I have a string like the following

~~<b>A<i>C</i></b>~~/~~<u>D</u><b>B</b>~~has done this.

I am trying to get the text inside each <b> tag. I am using this pattern:

<b>(.+)</b>

But this returns <b>A<i>C</i></b>~~/~~<u>D</u><b>B</b> as a single match. I need <b>A<i>C</i></b> as the first match and <b>B</b> as the second match.

Can anyone please help?


Solution

You need to use a non-greedy quantifier:

<b>(.+?)</b>

This ensures that each match stops at the first </b> after the opening <b>, instead of running on to the last </b> in the string.
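To see the difference between the two patterns, here is a minimal sketch in Python (the question does not name a language, so the choice of Python and its `re` module is an assumption). It runs both patterns against the string from the question; the variable names are illustrative only.

```python
import re

# The sample string from the question, copied as posted.
text = "~~<b>A<i>C</i></b>~~/~~<u>D</u><b>B</b>~~has done this."

# Greedy: .+ grabs as much as possible, so the match runs to the LAST </b>.
greedy = re.findall(r"<b>(.+)</b>", text)

# Non-greedy: .+? grabs as little as possible, so each match ends at the
# first </b> that follows its opening <b>.
lazy = re.findall(r"<b>(.+?)</b>", text)

print(greedy)  # ['A<i>C</i></b>~~/~~<u>D</u><b>B']
print(lazy)    # ['A<i>C</i>', 'B']
```

Note that `findall` returns only the captured group, so the surrounding <b> and </b> are not part of the results. Also, `.` does not match a newline by default in Python; if the content can span lines, passing `re.DOTALL` would be needed.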

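However, I would generally recommend using a proper XML or HTML parser for this sort of thing. Regular expressions are simply not powerful enough to handle the recursive structure of XML. For example, if one <b> is nested inside another, the non-greedy pattern still stops at the inner </b> and returns a truncated match.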
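As a rough illustration of the parser approach, here is a sketch using `html.parser` from the Python standard library (again, Python is an assumption, and `BoldExtractor` and `bold_texts` are hypothetical names made up for this example). It collects the text inside each outermost <b> element and keeps a depth counter so that nested <b> tags are handled.

```python
from html.parser import HTMLParser


class BoldExtractor(HTMLParser):
    """Collects the text inside each outermost <b> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # how many <b> elements we are currently inside
        self.results = []   # one string per outermost <b> element
        self._parts = []    # text pieces of the <b> element being read

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            if self.depth == 0:
                self._parts = []
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "b" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.results.append("".join(self._parts))

    def handle_data(self, data):
        # Keep text only while inside a <b>; other tags are skipped,
        # but their text content is kept.
        if self.depth > 0:
            self._parts.append(data)


def bold_texts(markup):
    parser = BoldExtractor()
    parser.feed(markup)
    parser.close()
    return parser.results


print(bold_texts("~~<b>A<i>C</i></b>~~/~~<u>D</u><b>B</b>~~has done this."))
# ['AC', 'B']
```

Unlike the regex, this returns the text without the inner markup (`AC` rather than `A<i>C</i>`). With nested input such as `<b>x<b>y</b>z</b>` it returns `['xyz']`, whereas `<b>(.+?)</b>` would return `x<b>y`.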

Licensed under: CC-BY-SA with attribution