Question

I'm trying to edit a large XML file (extracted from an Excel .xlsm file) using PHP, and I was wondering which performs better: QueryPath or PHP's DOMDocument?

The file weighs at least 8 MB, and contains around 400k lines (when formatted).

Thanks for the feedback

Solution

QueryPath is basically just a wrapper around DOMDocument. It adds relatively little overhead to a bare DOMDocument object. For access and write operations (things like attr(), append(), and such), there should be no noteworthy performance difference.
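To make the "thin wrapper" point concrete, here is a minimal sketch of the same kind of edit done directly on a DOMDocument. The sample XML and the element and attribute names are made up for illustration; setAttribute() and appendChild() are only rough counterparts of QueryPath's attr() and append(), not a statement of how QueryPath implements them.

```php
<?php
// Hypothetical sample document; the names here are illustrative only.
$xml = '<root><item id="a"/></root>';

$doc = new DOMDocument();
$doc->loadXML($xml);

// Grab the first <item> element.
$item = $doc->getElementsByTagName('item')->item(0);

// Write an attribute (roughly what attr('status', 'done') would do).
$item->setAttribute('status', 'done');

// Append a new child element (roughly what append('<note>hello</note>') would do).
$note = $doc->createElement('note', 'hello');
$item->appendChild($note);

// Serialize only the root element, without the XML declaration.
$result = $doc->saveXML($doc->documentElement);
echo $result, "\n";
```

For a real 8 MB file you would call $doc->load($path) and $doc->save($path) instead of working on a string.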

But then we come to the big issue: finding stuff.

Traditionally, traversing a DOMDocument is done by either "walking the tree" or using getElementsByTagName() (available on DOMDocument and DOMElement). This performs relatively well if you're willing to write the code.
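Both approaches look roughly like this. It's a sketch on a tiny made-up fragment that loosely imitates spreadsheet cells (the `row`/`c`/`v` names and the `collectCells` helper are assumptions for the example, not taken from your file):

```php
<?php
// Tiny stand-in for a sheet; a real file would be loaded with $doc->load($path).
$xml = '<sheetData>'
     . '<row r="1"><c r="A1"><v>1</v></c><c r="B1"><v>2</v></c></row>'
     . '<row r="2"><c r="A2"><v>3</v></c></row>'
     . '</sheetData>';

$doc = new DOMDocument();
$doc->loadXML($xml);

// Option 1: let the DOM collect every <c> element for you.
$cells = [];
foreach ($doc->getElementsByTagName('c') as $cell) {
    $cells[] = $cell->getAttribute('r');
}

// Option 2: walk the tree by hand (depth-first, document order).
function collectCells(DOMNode $node, array &$found): void
{
    foreach ($node->childNodes as $child) {
        if ($child instanceof DOMElement) {
            if ($child->localName === 'c') {
                $found[] = $child->getAttribute('r');
            }
            collectCells($child, $found);
        }
    }
}

$walked = [];
collectCells($doc->documentElement, $walked);

print_r($cells);
print_r($walked);
```

Walking by hand is more code, but it lets you skip whole subtrees you know you don't care about.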

Querying with QueryPath 2.x will be fairly slow on a document that size unless you use very specific selectors (e.g., ':root>foo>bar>baz').

However, QueryPath 3.x, which is about to go into Alpha1, is many, many times faster when querying large documents. Doing qp('foo') is as fast as XPath... which brings me to the last option.

Then there's the built-in XPath processor (DOMXPath) that also comes with PHP's libxml support. That might give you better performance if you're working with a large XML document, since it is executed at C speed instead of at PHP speed. But you will have to write XPath expressions, which are (IMHO) sort of a pain.
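A minimal sketch of the XPath route, assuming the file is a worksheet part from the .xlsm package. Those parts declare the SpreadsheetML namespace as their default namespace, so you need to register a prefix before any query will match; the prefix `s` and the sample cell contents are arbitrary choices for the example.

```php
<?php
// Small stand-in for a worksheet part; real files are loaded with $doc->load($path).
$ns  = 'http://schemas.openxmlformats.org/spreadsheetml/2006/main';
$xml = '<worksheet xmlns="' . $ns . '"><sheetData>'
     . '<row r="1"><c r="A1"><v>10</v></c><c r="B1"><v>20</v></c></row>'
     . '</sheetData></worksheet>';

$doc = new DOMDocument();
$doc->loadXML($xml);

$xpath = new DOMXPath($doc);
// Without this, '//c' matches nothing because the elements are namespaced.
$xpath->registerNamespace('s', $ns);

// Find the value element of cell B1.
$nodes = $xpath->query('//s:c[@r="B1"]/s:v');
$value = $nodes->item(0)->nodeValue;

// Edit it in place; the change is reflected in $doc.
$nodes->item(0)->nodeValue = '42';
$updated = $xpath->query('//s:c[@r="B1"]/s:v')->item(0)->nodeValue;

echo $value, ' -> ', $updated, "\n";
```

The returned nodes are ordinary DOMElement objects, so you can mix XPath for finding with plain DOMDocument calls for modifying.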

So the bottom line:

  • Basics: Either one will do.
  • Modification: Either one will do.
  • Lots of traversing:
    • DOMDocument will make you traverse manually.
    • QueryPath 2.x is slow.
    • QueryPath 3.x is much faster.
    • XPath is fastest... but it's XPath
Licensed under: CC-BY-SA with attribution