Question

I want to use JDOM to read in an XML file, then use XPath to extract data from the JDOM Document. It creates the Document object fine, but when I use XPath to query the Document for a List of elements, I get nothing.

My XML document has a default namespace defined in the root element. The funny thing is, when I remove the default namespace, it successfully runs the XPath query and returns the elements I want. What else must I do to get my XPath query to return results?

XML:

<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.foo.com">
<dvd id="A">
  <title>Lord of the Rings: The Fellowship of the Ring</title>
  <length>178</length>
  <actor>Ian Holm</actor>
  <actor>Elijah Wood</actor>
  <actor>Ian McKellen</actor>
</dvd>
<dvd id="B">
  <title>The Matrix</title>
  <length>136</length>
  <actor>Keanu Reeves</actor>
  <actor>Laurence Fishburne</actor>
</dvd>
</collection>

Java:

public static void main(String args[]) throws Exception {
    SAXBuilder builder = new SAXBuilder();
    Document d = builder.build("xpath.xml");
    XPath xpath = XPath.newInstance("collection/dvd");
    xpath.addNamespace(d.getRootElement().getNamespace());
    System.out.println(xpath.selectNodes(d));
}
Was it helpful?

Solution

XPath 1.0 doesn't support the concept of a default namespace (XPath 2.0 does). Any unprefixed tag is always assumed to be part of the no-name namespace.

When using XPath 1.0 you need something like this:

public static void main(String args[]) throws Exception {
    SAXBuilder builder = new SAXBuilder();
    Document d = builder.build("xpath.xml");
    XPath xpath = XPath.newInstance("x:collection/x:dvd");
    xpath.addNamespace("x", d.getRootElement().getNamespaceURI());
    System.out.println(xpath.selectNodes(d));
}

OTHER TIPS

I had a similiar problem, but mine was that I had a mixture of XML inputs, some of which had a namespace defined and others that didn't. To simplify my problem I ran the following JDOM snippet after loading the document.

for (Element el : doc.getRootElement().getDescendants(new ElementFilter())) {
    if (el.getNamespace() != null) el.setNamespace(null);
}

After removing all the namespaces I was able to use simple getChild("elname") style navigation or simple XPath queries.

I wouldn't recommend this technique as a general solution, but in my case it was definitely useful.

You can also do the following

/*[local-name() = 'collection']/*[local-name() = 'dvd']/

Here is list of useful xpath queries.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top