What PHP HTML tokenizers can I use?
Question
I need to process HTML submitted in my web application and don't want to munge the whole thing with regular expressions. What tokenizer approach and/or software should I use?
Solution
I would use the DOMDocument::loadHTML method to load the HTML document. If you want simpler handling than the DOMDocument methods provide, you can convert it to a SimpleXML object with simplexml_import_dom().
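Roughly, that could look like the sketch below. The markup in $html is a made-up stand-in for whatever your application receives, and the variable names are only for illustration. Submitted HTML is often not valid, so the sketch uses libxml_use_internal_errors() to collect parser warnings instead of printing them.

```php
<?php
// Hypothetical input standing in for user-submitted HTML.
$html = '<ul><li><a href="/one">One</a></li><li><a href="/two">Two</a></li></ul>';

// Collect libxml parse warnings internally rather than emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
// loadHTML() wraps a fragment like this one in <html><body>...</body></html>.
$doc->loadHTML($html);

// Discard the collected warnings and restore the previous error setting.
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Option 1: walk the tree with the DOM API.
$hrefs = [];
foreach ($doc->getElementsByTagName('a') as $anchor) {
    $hrefs[] = $anchor->getAttribute('href');
}

// Option 2: convert to SimpleXML for terser property-style access.
// The returned element is the document root (<html>).
$xml = simplexml_import_dom($doc);
$labels = [];
foreach ($xml->body->ul->li as $item) {
    // Casting an element to string yields its text content.
    $labels[] = (string) $item->a;
}

echo implode(', ', $hrefs), PHP_EOL;
echo implode(', ', $labels), PHP_EOL;
```

The DOM API is the more complete of the two. The SimpleXML view is mostly a convenience when you already know the shape of the markup you are reading.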
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow