One approach is to validate a few things before parsing. You could use a regex to validate the XML tags, but it may be easier to use a Stack onto which you push every < and > symbol. Afterwards, just loop through it and assert that you never get the same symbol twice in a row.
This raises the question: how do you distinguish between <MyElement>> and <MyEl>ement>?
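The bracket check described above could look roughly like the following. This is only a minimal sketch in Python (the question's language isn't stated, so that choice is an assumption), and the function name brackets_alternate is made up for illustration. Instead of pushing every bracket onto a stack and looping over it afterwards, it just remembers the last bracket seen, which is enough to detect the same symbol twice in a row.

```python
def brackets_alternate(text: str) -> bool:
    """Return True if '<' and '>' never appear twice in a row.

    Hypothetical helper: it only looks at the order of angle brackets
    and knows nothing about element names, attributes or nesting.
    """
    last = None  # the most recent '<' or '>' seen so far
    for ch in text:
        if ch in "<>":
            if ch == last:
                return False  # same bracket twice in a row
            last = ch
    return True


print(brackets_alternate("<a><b/></a>"))   # well-formed input passes
print(brackets_alternate("<MyElement>>"))  # rejected
print(brackets_alternate("<MyEl>ement>"))  # rejected as well
```

Note that both problem inputs are rejected in exactly the same way, so the check tells you that something is wrong but not which of the two cases you are dealing with. It can also reject well-formed XML, since a literal > is allowed in text content and both brackets may appear inside comments and CDATA sections.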
This is all pretty vague though: what do you want to happen when the XML turns out to be invalid? How far do you want to take this pre-processing validation?
I believe the best option here is not to proceed. You can't fix every kind of malformed XML thrown at you, and it may be better to simply inform the user and stop there.
If the source is consistently sending malformed XML at you, you'll have to contact the maintainers or look for alternatives.
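To illustrate the "don't proceed" option: one way to do it is to let the parser itself decide and turn its error into a message for the user. The sketch below assumes Python's standard xml.etree.ElementTree; the function name parse_or_report and the wording of the message are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_or_report(xml_text: str):
    """Return (root, None) on success or (None, message) on malformed XML.

    Hypothetical helper: no repair is attempted, the parser's own
    error is simply passed on so the caller can show it to the user.
    """
    try:
        return ET.fromstring(xml_text), None
    except ET.ParseError as exc:
        # ParseError describes what went wrong and where
        return None, f"Invalid XML: {exc}"


root, error = parse_or_report("<a><b></a>")
if error is not None:
    print(error)  # inform the user and stop here
```

The caller gets either a parsed tree or an explanation, and nothing in between tries to guess what the sender meant.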