Question
I want to match a closing tag followed by zero or more spaces/newlines, followed by an opening tag, but only when that opening tag is followed by a lowercase letter. Examples:
text</p> <p>blah
matches </p> <p>
text</i><i>and more text <b>but not this</b>
matches </i><i>
text</i> <i>And more text
does not match
I tried this:
</.*?>\s*\n*\s*<.*>(?=[a-z])
but it doesn't work for the second example, as it will match </i><i> and more text </b> even though the question mark should make it "lazy".
Solution
Try:
</[^>]+>\s*<[^/>]+>(?=[a-z])
Change each '+' to '*' if you also want to match empty tags such as </> or <>.
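As a quick check, here is a minimal Python sketch that runs the pattern above against the three examples from the question. The helper name find_tag_pairs is made up for illustration, and Python's re module is an assumption, since the question does not say which regex engine is in use.

```python
import re

# Pattern copied from the answer: a closing tag, optional whitespace,
# an opening tag, then a lookahead for a lowercase letter.
TAG_PAIR = re.compile(r"</[^>]+>\s*<[^/>]+>(?=[a-z])")

def find_tag_pairs(text):
    """Hypothetical helper: return every matched closing/opening tag pair."""
    return [m.group(0) for m in TAG_PAIR.finditer(text)]

samples = [
    "text</p> <p>blah",
    "text</i><i>and more text <b>but not this</b>",
    "text</i> <i>And more text",
]
for sample in samples:
    print(repr(sample), "->", find_tag_pairs(sample))
```

Because [^>] and [^/>] cannot cross a closing angle bracket, each half of the pattern is confined to a single tag, so the third example cannot match by stretching into later text.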
OTHER TIPS
Making a quantifier lazy only makes the regex try the shortest possible match first, but if that doesn't work, it will gladly expand the match until the entire regex succeeds.
You need to be more specific in what you allow to match - for example by not allowing angle brackets inside a tag:
</[^<>]*>\s*<[^/][^<>]*>(?=[a-z])
(Also, \s already contains \n, so \s*\n*\s* can be shortened to \s*.)
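To see how a lazy quantifier still expands, the sketch below compares a fully lazy pattern with the bracket-excluding pattern above. The input string is an illustrative variation on the third example (with a later tag appended), not one taken from the question, and first_match is a hypothetical helper; Python's re module is assumed.

```python
import re

# Lazy version: .*? starts short but may grow past a '>' if that is the
# only way for the lookahead to succeed.
LAZY = re.compile(r"</.*?>\s*<.*?>(?=[a-z])")

# Stricter version from the tip above: no angle brackets inside a tag.
STRICT = re.compile(r"</[^<>]*>\s*<[^/][^<>]*>(?=[a-z])")

def first_match(pattern, text):
    """Hypothetical helper: return the first matched text, or None."""
    m = pattern.search(text)
    return m.group(0) if m else None

# Illustrative input: <i> is followed by an uppercase letter, so no match is wanted.
text = "text</i> <i>And more <b>bold</b>"

print(first_match(LAZY, text))    # </i> <i>And more <b>
print(first_match(STRICT, text))  # None
```

The lazy pattern first tries to end the opening tag at <i>, fails the lookahead on "A", and then keeps extending until it reaches the ">" of <b>, which is followed by a lowercase letter. The stricter pattern has nowhere to extend to, so it correctly reports no match.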