Question

I have this sample string:

≪! [If Gte Mso 9]>≪Xml>  ≪Br /> ≪O:Office Document Settings>  ≪Br /> ≪O:Allow Png/>  ≪Br /> ≪/O:Off...

I would like to match anything that begins with "≪" and ends with ">", and replace it with an empty string ("").

I've been using Rubular, but I'm having a hard time working out how to set this one up.

Any ideas?


Solution

result = subject.gsub(/≪[^>]*>/, '')

should do the trick.

[^>]* means: Match any number of characters except >.
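
As a minimal sketch of why the negated character class matters, the snippet below runs the pattern on a short made-up input (the string and variable names are illustrative, not taken from the question's full document) and compares it with a greedy `.*`, which would swallow everything between the first "≪" and the last ">".

```ruby
# Hypothetical input: two tags with plain text between and around them.
subject = 'Hello ≪Br />world≪O:Allow Png/>!'

# [^>]* stops at the first > it meets, so each tag is removed on its own.
result = subject.gsub(/≪[^>]*>/, '')

# For contrast: a greedy .* runs to the LAST > on the line,
# so the text between the two tags is removed as well.
greedy = subject.gsub(/≪.*>/, '')

puts result  # Hello world!
puts greedy  # Hello !
```

Note that the replacement leaves surrounding whitespace untouched, so a real document may need an extra `squeeze(' ')` or `strip` afterwards.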

OTHER TIPS

The pattern is as simple as this:

≪[^>]*>

Just a helpful hint: I use Rubular to help with writing and debugging regexes.

It sure looks like you're trying to parse XML with regular expressions, which is a very difficult and fragile way to extract the data you need from that document.

You might be better off parsing it and selecting the information you need using XPath or DOM.
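
A rough sketch of the parser-based approach, assuming the input is well-formed XML with ordinary angle brackets (the fragment in the question is not, so it would need cleaning up first). It uses REXML, which ships with standard Ruby installations; the element names here are hypothetical.

```ruby
require 'rexml/document'

# Hypothetical well-formed document, used only for illustration.
xml = '<root><item>first</item><item>second</item></root>'

doc = REXML::Document.new(xml)

# Select every <item> element with XPath and collect its text content.
texts = REXML::XPath.match(doc, '//item').map(&:text)

puts texts.inspect
```

The parser handles nesting, attributes and quoting, which a regex such as the one above does not attempt to do.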

Licensed under: CC-BY-SA with attribution