Problem

I want to match a closing tag, followed by zero or more spaces/newlines, followed by an opening tag, but only when that opening tag is followed by a lowercase letter. Examples:

  • text</p> <p>blah matches </p> <p>
  • text</i><i>and more text <b>but not this</b> matches </i><i>
  • text</i> <i>And more text does not match

I tried this: </.*?>\s*\n*\s*<.*>(?=[a-z]), but it doesn't work for the second example: it matches </i><i>and more text <b>, even though the question mark should make it "lazy".


Solution

Try:

</[^>]+>\s*<[^/>]+>(?=[a-z])

Change each '+' to '*' if you want to be able to match empty tags.
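A quick way to check this pattern against the three examples from the question is a small Python sketch. The helper name find_tag_pair is made up for illustration, and Python's re module is assumed here since the question doesn't say which regex flavor is in use:

```python
import re

# Pattern from the answer above: the closing tag may not contain '>',
# and the opening tag may contain neither '/' nor '>'.
PATTERN = re.compile(r"</[^>]+>\s*<[^/>]+>(?=[a-z])")


def find_tag_pair(text):
    """Return the first closing/opening tag pair that matches, or None."""
    m = PATTERN.search(text)
    return m.group(0) if m else None


samples = [
    "text</p> <p>blah",
    "text</i><i>and more text <b>but not this</b>",
    "text</i> <i>And more text",
]
for s in samples:
    print(repr(s), "->", repr(find_tag_pair(s)))
```

The first two samples should give '</p> <p>' and '</i><i>', and the third should give None because the opening tag is followed by an uppercase letter.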

Other tips

Making a quantifier lazy only makes the regex try the shortest possible match first, but if that doesn't work, it will gladly expand the match until the entire regex succeeds.
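To see that expansion happen, here is a rough Python sketch (Python's re module is an assumption, and the ALL_LAZY pattern and first_match helper are only for illustration). Note that in the pattern from the question only the first quantifier is lazy; the second .* is greedy. But even with both quantifiers lazy, the match still grows past the first tag when the lookahead fails:

```python
import re

# The pattern from the question: first quantifier lazy, second one greedy.
ORIGINAL = re.compile(r"</.*?>\s*\n*\s*<.*>(?=[a-z])")

# Hypothetical variant with both quantifiers lazy, for comparison.
ALL_LAZY = re.compile(r"</.*?>\s*<.*?>(?=[a-z])")


def first_match(pattern, text):
    """Return the text of the first match, or None if nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None


# The greedy .* runs to the end, then backtracks to the last '>' that is
# followed by a lowercase letter.
print(first_match(ORIGINAL, "text</i><i>and more text <b>but not this</b>"))

# <i> is followed by 'A', so the lookahead fails; the lazy .*? then expands
# until it reaches a '>' that is followed by a lowercase letter.
print(first_match(ALL_LAZY, "text</i> <i>And more <b>text</b>"))
```

In both cases the match swallows text between the tags, which is why restricting what the quantifier may consume works better than making it lazy.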

You need to be more specific in what you allow to match - for example by not allowing angle brackets inside a tag:

</[^<>]*>\s*<[^/][^<>]*>(?=[a-z])

(Also, \s already matches \n, so \s*\n*\s* can be shortened to \s*.)
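As a small check of both points, the sketch below (again assuming Python's re module; the helper name strict_match is hypothetical) shows that \s* alone covers a newline between the tags, and that the stricter pattern can no longer expand past a tag when the lookahead fails:

```python
import re

# Stricter pattern from above: no angle brackets allowed inside a tag.
STRICT = re.compile(r"</[^<>]*>\s*<[^/][^<>]*>(?=[a-z])")


def strict_match(text):
    """Return the first match of the stricter pattern, or None."""
    m = STRICT.search(text)
    return m.group(0) if m else None


# \s matches spaces, tabs and newlines, so \s* is enough on its own.
print(re.fullmatch(r"\s*", " \n\t ") is not None)

# A newline plus indentation between the tags is still matched.
print(repr(strict_match("text</p>\n  <p>blah")))

# The opening tag is followed by 'A', and [^<>]* cannot run past '>',
# so there is no match at all.
print(strict_match("text</i> <i>And more <b>text</b>"))
```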

License: CC-BY-SA with attribution