Question

I have a text with many expressions like this <.....>, e.g.:

<..> Text1 <.sdfdsvd> Text 2 <....dgdfg> Text3 <...something> Text4

How can I remove all brackets <...> together with the commands/text inside them? The other "real" text between them (like Text1 and Text 2 above) should not be touched.

I tried with the regular expression:

<.*>

But this also matches a block like this, including the text in between:

<..> Text1 <.sdfdsvd>

My second attempt was to search for all expressions <...> without a third bracket between the two, so I tried:

<.*[^>^<]>

But that does not work either; the behavior is unchanged. How do I construct the expression correctly?

Solution

This works in Notepad++ (with Search Mode set to "Regular expression"):

Find what: <[^>]+?>

Replace with: (leave empty)

Try it out: http://regex101.com/r/lC9mD4
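For anyone who wants to check the pattern outside Notepad++, here is a minimal sketch in Python. It assumes Python's re module treats this particular pattern the same way Notepad++ does, which should hold for a simple negated character class; the variable names are made up for illustration.

```python
import re

# Sample line taken from the question.
text = "<..> Text1 <.sdfdsvd> Text 2 <....dgdfg> Text3 <...something> Text4"

# Same pattern as in "Find what": a '<', one or more non-'>' characters, a '>'.
tag = re.compile(r"<[^>]+?>")

# Replacing every match with the empty string mirrors an empty "Replace with".
stripped = tag.sub("", text)

print(repr(stripped))
```

Only the bracketed parts are removed, so the spaces that surrounded each tag stay behind and show up as double spaces in the result.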

There are a few problems with your attempt: <.*[^>^<]>

  • .* is greedy: it consumes as many characters as possible and only then backtracks to the last > it can find. The match therefore runs from the first < to the final >, swallowing every tag and all the text in between. In my solution I have made the quantifier lazy (also called reluctant or non-greedy), which stops at the first possible match: .*?. I apply the ? to the character class instead: [^>]+?. Because [^>] can never match a >, the lazy ? is redundant here, and [^>]+ matches the same text. Use * instead of + if empty brackets <> should be removed as well.
  • [^>^<] is incorrect for two reasons, one small, one big. The small reason is that the first caret ^ negates the class, and the characters following it are >, ^, and <. So the class also excludes a literal caret, which is unintended (but harmless unless a tag ends with ^). The larger problem is that the class matches exactly one character, so it only constrains the last character before the closing >; the greedy .* in front of it can still run across any number of brackets. The class itself has to be repeated, which is signified by the plus sign: [^><]+.

Otherwise, your attempt is not that far off from my solution.
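The difference between these variants can be seen by running them side by side. The following is a small sketch in Python rather than Notepad++, on the assumption that both engines agree on these basic constructs; the dictionary keys are just labels chosen for this example.

```python
import re

# Shortened sample from the question: two tags with real text around them.
sample = "<..> Text1 <.sdfdsvd> Text 2"

patterns = {
    "greedy": r"<.*>",            # first attempt from the question
    "attempt": r"<.*[^>^<]>",     # second attempt from the question
    "lazy_class": r"<[^>]+?>",    # pattern from the solution
    "plain_class": r"<[^>]+>",    # same class without the lazy '?'
}

# Collect every non-overlapping match for each pattern.
results = {name: re.findall(p, sample) for name, p in patterns.items()}

for name, found in results.items():
    print(f"{name:12} {patterns[name]:14} {found}")
```

Both attempts from the question return a single match that spans both tags, while the two character-class versions each return the two tags separately.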

OTHER TIPS

This seems to work:

<[^\s]*>

It looks for a left bracket, then any run of non-whitespace characters, then a right bracket. It fails if there is whitespace between the brackets (<text1 text2>), though, and in that case a modification of one of your attempts works better:

<[^<>]*>

This one looks for a left bracket, then anything that isn't a left bracket or right bracket, then a right bracket.
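To see where the two patterns differ, here is a short sketch in Python, assuming its re module behaves like Notepad++ for these classes. The bracket-excluding class is written here as [^<>], and both test strings are invented for this example.

```python
import re

no_space_class = re.compile(r"<[^\s]*>")    # anything but whitespace
no_bracket_class = re.compile(r"<[^<>]*>")  # anything but '<' or '>'

spaced = "<text1 text2> keep <x>"  # whitespace inside the first tag
packed = "<a>keep<b>"              # no whitespace anywhere

spaced_ws = no_space_class.findall(spaced)
spaced_br = no_bracket_class.findall(spaced)
packed_ws = no_space_class.findall(packed)
packed_br = no_bracket_class.findall(packed)

print(spaced_ws, spaced_br)
print(packed_ws, packed_br)
```

The whitespace-based pattern skips the tag that contains a space, and when there is no whitespace between two tags it runs across both of them, taking the text in between with it. The bracket-based pattern finds each tag on its own in both strings.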

Try <.*?>. Without the "?", the regular expression tries to find the longest string that matches. Using "*?" forces it to find the shortest.
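A quick way to compare the greedy and lazy forms is the sketch below, again in Python under the assumption that it behaves like Notepad++ for these patterns. It also shows one caveat: by default "." does not match a line break in Python, so a tag that spans two lines is only found with the DOTALL flag (Notepad++ has a comparable ". matches newline" option) or with a negated class. The multi-line string is a made-up example.

```python
import re

one_line = "<..> Text1 <.sdfdsvd> Text 2"
multi_line = "<first\nsecond> Text1"  # hypothetical tag spanning two lines

greedy = re.findall(r"<.*>", one_line)   # longest possible match
lazy = re.findall(r"<.*?>", one_line)    # shortest possible matches

# '.' stops at the newline unless DOTALL is set; a negated class does not care.
lazy_multi = re.findall(r"<.*?>", multi_line)
lazy_dotall = re.findall(r"<.*?>", multi_line, flags=re.DOTALL)
class_multi = re.findall(r"<[^>]*>", multi_line)

print(greedy, lazy)
print(lazy_multi, lazy_dotall, class_multi)
```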

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow