Question

We need to find all prefix:URI namespace pairs in an XML file. Users can supply any XML file and any XPath to query it, so we need the prefix:URI mappings in order to register them when the XPath is evaluated.

We presently use:

selectNodes("//namespace::*[name() != 'xml'][not(../../namespace::*=.)]");

This does return all pairs, but it is slow. I looked at this answer, but it is also slow. Is there a fast way to do this? I need it solely to perform XPath queries against the XML.

I'm doing this both in Java (using dom4j) and .NET.

thanks - dave

No correct solution

OTHER TIPS

You will not be able to improve much on the code provided by Michael Kay and Dimitre Novatchev in the answers to the linked question.

The code below (also theirs) touches each node (elements and attributes) exactly once, so the runtime of everything inside distinct-values() is O(n) in the number of nodes. In the worst case, every node has a namespace attached, so distinct-values() has to deduplicate n strings, which is O(n*log n) if it is done by sorting.

(: each namespace:uri-combination only once :)
distinct-values(
  (: analyze all nodes with namespace set, both attributes and elements :)
  /descendant-or-self::*/(.|@*)[namespace-uri(.)]
  (: build result string :)
  /concat(
    substring-before(name(), ':'), ': ', namespace-uri(.), '
'
  )
)
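
Since the question mentions Java, the same "touch each node once" idea can also be sketched outside XPath with a streaming parser. This is only a rough sketch and not something from the linked answers: it assumes the XML is available as a string, uses the JDK's StAX API instead of dom4j, and keeps only the first URI seen for each prefix (a document can rebind a prefix in a nested scope). The class name NamespaceCollector and the method collect are made up for illustration.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of dom4j or of the linked answers.
final class NamespaceCollector {

    // Reads the document once and records every namespace declaration it meets.
    // Assumption: the first URI bound to a prefix wins; later rebindings are ignored.
    static Map<String, String> collect(String xml) throws XMLStreamException {
        Map<String, String> bindings = new LinkedHashMap<>();
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                // Only the declarations written on this element are reported here.
                for (int i = 0; i < reader.getNamespaceCount(); i++) {
                    String prefix = reader.getNamespacePrefix(i); // null for xmlns="..."
                    String uri = reader.getNamespaceURI(i);
                    if (uri == null || uri.isEmpty()) {
                        continue; // xmlns="" undeclares the default namespace
                    }
                    bindings.putIfAbsent(prefix == null ? "" : prefix, uri);
                }
            }
        } finally {
            reader.close();
        }
        return bindings;
    }
}
```

Unlike the XPath expression above, this looks at namespace declarations rather than at the names of elements and attributes, so it also reports prefixes that are declared but never used. The resulting map could then be handed to the XPath engine (with dom4j, XPath.setNamespaceURIs accepts such a map). The default namespace shows up under the empty prefix, which XPath 1.0 cannot address directly, so it would need an invented prefix before use.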