Question

I want to match a closing tag, followed by zero or more spaces/newlines, followed by an opening tag, but only when that opening tag is followed by a lowercase letter. Examples:

  • text</p> <p>blah matches </p> <p>
  • text</i><i>and more text <b>but not this</b> matches </i><i>
  • text</i> <i>And more text does not match

I tried this: </.*?>\s*\n*\s*<.*>(?=[a-z]), but it doesn't work for the second example, as it will match </i><i> and more text </b> even though the question mark should make it "lazy".

Solution

Try:

</[^>]+>\s*<[^/>]+>(?=[a-z])

Change the '+' to '*' if you want to be able to match empty tags.
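
As a quick check, the pattern above can be run against the three examples from the question. This is a minimal sketch in Python (the question does not say which regex engine is in use, so Python's `re` module is an assumption, and the helper name `first_match` is made up for illustration):

```python
import re

# Pattern from the answer above; [^>]+ and [^/>]+ cannot cross a '>'.
pattern = re.compile(r"</[^>]+>\s*<[^/>]+>(?=[a-z])")

examples = [
    "text</p> <p>blah",
    "text</i><i>and more text <b>but not this</b>",
    "text</i> <i>And more text",
]

def first_match(text):
    """Return the first matched substring, or None if nothing matches."""
    m = pattern.search(text)
    return m.group() if m else None

results = [first_match(s) for s in examples]
print(results)
```

The first two examples should yield the tag pair only, and the third should yield no match because the opening tag is followed by an uppercase letter.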

Other tips

Making a quantifier lazy only makes the regex try the shortest possible match first, but if that doesn't work, it will gladly expand the match until the entire regex succeeds.
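
To see this expansion in action, here is a small sketch, again assuming Python's `re` module. The all-lazy pattern and the sample string are made up for illustration and are not taken from the question:

```python
import re

# Hypothetical sample: the first opening tag is followed by an uppercase 'A'.
text = "text</i> <i>And more <b>bold</b>"

# Every quantifier inside the tags is lazy here.
lazy = re.compile(r"</.*?>\s*<.*?>(?=[a-z])")

# A negated character class cannot grow past the closing '>'.
strict = re.compile(r"</[^<>]*>\s*<[^/][^<>]*>(?=[a-z])")

lazy_match = lazy.search(text)
strict_match = strict.search(text)

# The lazy version first tries "<i>", fails the lookahead on 'A',
# then keeps expanding until "<i>And more <b>" satisfies it.
print(lazy_match.group())
print(strict_match)
```

The lazy pattern still returns a match that spans two tags, while the negated-class pattern reports no match at all, which is the desired result for this input.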

You need to be more specific in what you allow to match - for example by not allowing angle brackets inside a tag:

</[^<>]*>\s*<[^/][^<>]*>(?=[a-z])

(Also, \s already matches \n, so \s*\n*\s* can be shortened to \s*.)
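
The point about \s can be checked with a short sketch (Python's `re` module assumed; the sample string with mixed whitespace is invented for illustration):

```python
import re

tag_pair = re.compile(r"</[^<>]*>\s*<[^/][^<>]*>(?=[a-z])")

# Hypothetical input: space, newline and tab between the two tags.
sample = "text</p> \n\t <p>blah"

match = tag_pair.search(sample)
matched_text = match.group() if match else None

# repr() makes the whitespace inside the match visible.
print(repr(matched_text))
```

A single `\s*` is enough to bridge the space, newline and tab between the closing and opening tag.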

License: CC-BY-SA with attribution
Not affiliated with StackOverflow