I would strongly recommend using an XML parser and parsing the document that way. Since your document is reportedly large, SAX is a good fit: it streams the document through rather than loading all of it into memory at once. You'll avoid all sorts of edge cases that regular expressions can't handle (since XML isn't a regular language).
In your example above, you should probably maintain a stack of the XML elements encountered so far, and track whether <inf> is preceded by <titletext>.
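
A minimal sketch of that idea, using Python's standard xml.sax module. I'm assuming here that "preceded by" means <inf> appears somewhere inside an open <titletext> element; the names InfHandler and find_inf_in_titletext are just illustrative, and you'd adapt the condition to whatever your real rule is.

```python
import xml.sax


class InfHandler(xml.sax.ContentHandler):
    """Collects the text of every <inf> that has a <titletext> ancestor."""

    def __init__(self):
        super().__init__()
        self.stack = []      # names of the currently open elements
        self.buffer = None   # text pieces of the <inf> being captured, if any
        self.matches = []    # captured <inf> contents

    def startElement(self, name, attrs):
        # Assumption: "preceded by" means <titletext> is an open ancestor.
        if name == "inf" and "titletext" in self.stack:
            self.buffer = []
        self.stack.append(name)

    def characters(self, content):
        # SAX may deliver text in several chunks, so accumulate them.
        if self.buffer is not None:
            self.buffer.append(content)

    def endElement(self, name):
        self.stack.pop()
        if name == "inf" and self.buffer is not None:
            self.matches.append("".join(self.buffer))
            self.buffer = None


def find_inf_in_titletext(xml_bytes):
    """Parse the given XML bytes and return the matching <inf> contents."""
    handler = InfHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.matches
```

For a large file on disk you would call xml.sax.parse(path, handler) instead of parseString, so the document is streamed rather than read into memory first.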