Question

I want to parse some HTML to extract the values of certain attributes and tags.

What HTML parsers do you recommend? Any pros and cons?

Solution

NekoHTML, TagSoup, and JTidy each let you parse HTML and then process the result with XML tools such as XPath.
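
To illustrate the second half of that workflow, here is a minimal sketch of querying attribute values with XPath using only the JDK's built-in XML classes. It assumes the markup has already been turned into well-formed XML by one of the parsers above; it does not call any of those libraries, and the class and method names (XPathOverTidiedHtml, select) are made up for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: the class and method names are illustrative only.
final class XPathOverTidiedHtml {

    // Evaluates an XPath expression against markup that is ALREADY well-formed
    // (assumption: a tidying parser produced it) and returns the text of each match.
    static List<String> select(String wellFormedMarkup, String expression) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(wellFormedMarkup)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);

        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            // For attribute nodes this is the attribute value;
            // for elements it is the concatenated text content.
            values.add(nodes.item(i).getTextContent());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        String page = "<html><body><a href=\"a.html\">A</a>"
                + "<p><a href=\"b.html\">B</a></p></body></html>";
        // Collect the href attribute of every <a> element.
        System.out.println(select(page, "//a/@href"));
    }
}
```

Real-world HTML will usually not pass through DocumentBuilder directly, which is the reason for putting a lenient parser in front of it. If the chosen parser places elements in the XHTML namespace, an expression like //a/@href would need namespace handling to match.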

OTHER TIPS

I have tried HTML Parser, which is dead simple.

Do you need to do a full parse of the HTML? If you're just looking for specific values within the contents (a specific tag/param), then a simple regular expression might be enough, and could very well be faster.
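
As a rough sketch of the regular-expression route, the following pulls double-quoted href values out of anchor tags with java.util.regex. The class name, method name, and the choice of the href attribute are assumptions made for the example, and the pattern only suits markup whose shape you already know.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: the class and method names are illustrative only.
final class RegexAttributeGrab {

    // Matches <a ... href="value"> where the value is double-quoted.
    // Case-insensitive so that <A HREF="..."> is also picked up.
    private static final Pattern HREF = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Returns every captured href value, in the order it appears in the input.
    static List<String> hrefs(String html) {
        List<String> found = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        String html = "<p>See <a class=\"x\" href=\"one.html\">one</a> and "
                + "<A HREF=\"two.html\">two</A>.</p>";
        System.out.println(hrefs(html));
    }
}
```

This pattern ignores single-quoted and unquoted values, can be fooled by attributes such as data-href, and will also match tags inside comments or scripts. If the input is not that predictable, a real parser is the safer choice.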

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow