Using regular expressions in Java to extract the contents of an XML tag

StackOverflow https://stackoverflow.com/questions/14276257


I have a very large string, part of which looks like this:

<df>asdffs</df><titletext xml:lang="eng" original="y">Dose intensity <inf>low</inf> in advanced cancer: Have we answered the question?</titletext><sdf>gfdgas</sdf>

I need to find out whether an <inf> tag exists inside the <titletext> tag. I am writing this in Java.

Thanks in advance.

No correct solution


I would strongly recommend using an XML parser instead. Since your document is supposedly large, SAX is a good fit: it streams the document through rather than building the whole tree in memory. You'll avoid all sorts of edge cases that regular expressions can't handle, because XML isn't a regular language.

For your example, you would maintain a stack of the XML elements encountered so far and, each time an <inf> element starts, check whether a <titletext> element is still open on that stack.
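A minimal sketch of that idea, using the SAX parser that ships with the JDK. Some assumptions are baked in: the class and method names (InfInTitleText, hasInfInsideTitleText) are made up for illustration; the input is treated as a fragment with several top-level elements, so it is wrapped in a synthetic <root> element to make it well-formed; and because only one kind of ancestor matters here, a simple depth counter stands in for a full element stack.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class InfInTitleText {

    /** Returns true if an <inf> element starts while a <titletext> element is open. */
    static boolean hasInfInsideTitleText(String fragment) throws Exception {
        // Assumption: the fragment has several top-level elements, so wrap it
        // in a synthetic root to obtain a well-formed document.
        String xml = "<root>" + fragment + "</root>";

        final boolean[] found = {false};
        DefaultHandler handler = new DefaultHandler() {
            // Number of <titletext> elements currently open.
            private int titleDepth = 0;

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if ("titletext".equals(qName)) {
                    titleDepth++;
                } else if ("inf".equals(qName) && titleDepth > 0) {
                    found[0] = true;
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("titletext".equals(qName)) {
                    titleDepth--;
                }
            }
        };

        // The default factory is not namespace-aware, so qName is the plain tag name.
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return found[0];
    }

    public static void main(String[] args) throws Exception {
        String s = "<df>asdffs</df><titletext xml:lang=\"eng\" original=\"y\">"
                + "Dose intensity <inf>low</inf> in advanced cancer: "
                + "Have we answered the question?</titletext><sdf>gfdgas</sdf>";
        System.out.println(hasInfInsideTitleText(s)); // prints true
    }
}
```

If the data actually lives in a file or stream rather than a String, the same handler can be passed to one of the parse overloads that takes a File or InputStream, which is where the streaming benefit really shows.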

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow