Question

I need to query XML documents using XPath expressions in a Java application. I have created the following class, which accepts a file (location of the XML document on a local hard drive) and an XPath query, and should return the result of evaluating the given query on the given document.

import java.io.File;
import java.io.IOException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathException;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

public class XPathResolver 
{
    public String resolveXPath(File xmlFile, String xpathExpr) throws XPathException, ParserConfigurationException, SAXException, IOException
    {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(xmlFile);

        XPathFactory xPathfactory = XPathFactory.newInstance();
        XPath xpath = xPathfactory.newXPath();

        XPathExpression expr = xpath.compile(xpathExpr);

        return (String) expr.evaluate(doc, XPathConstants.STRING);
    }
}

Suppose now that I have the following XML document.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Document>
    <DocumentFormat>Email</DocumentFormat>
    <FileFormat>PDF</FileFormat>
</Document>

Evaluating either /Document/FileFormat or //FileFormat returns PDF (as expected).

Now consider a document that uses namespace prefixes, such as the following.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Document xmlns:file="http://www.example.com/xml/file">
    <DocumentFormat>Email</DocumentFormat>
    <file:FileFormat>PDF</file:FileFormat>
</Document>

Now /Document/FileFormat returns PDF, but //FileFormat does not return anything.

Why does my code not return the expected output for documents with namespace prefixes, and how do I fix it?


Solution

I tried your example with JDK 1.7.0.51 and can confirm your results. This seems a bit strange at first, but DocumentBuilderFactory is not namespace-aware by default.

So you first have to turn namespace awareness on:

factory.setNamespaceAware(true);

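With namespace awareness enabled, neither of your XPath expressions returns a result for the second document, which is the expected behaviour: the unprefixed name FileFormat no longer matches the element file:FileFormat.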
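
A minimal sketch of that effect, assuming the second document is inlined as a string. The class name NamespaceAwareParsingDemo and the evaluate helper are made up for illustration and are not part of the original code:

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class NamespaceAwareParsingDemo {

    // The second document from the question, inlined so the sketch is self-contained
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>"
        + "<Document xmlns:file=\"http://www.example.com/xml/file\">"
        + "<DocumentFormat>Email</DocumentFormat>"
        + "<file:FileFormat>PDF</file:FileFormat>"
        + "</Document>";

    // Hypothetical helper: parses with namespace awareness ON and evaluates the expression as a string
    static String evaluate(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            // No NamespaceContext is registered here, so only unprefixed names can be used
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            // Checked exceptions are wrapped only to keep the sketch short
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The element lives in a namespace, so the unprefixed name matches nothing
        System.out.println("[" + evaluate(XML, "//FileFormat") + "]");            // prints []
        // Elements without a namespace are still found by their plain name
        System.out.println("[" + evaluate(XML, "/Document/DocumentFormat") + "]"); // prints [Email]
    }
}
```

An empty node-set converts to the empty string, which is why the first call prints nothing between the brackets.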

You have to change your expressions to /Document/file:FileFormat and //file:FileFormat. As a last step, you have to register a NamespaceContext implementation that maps the namespace prefixes used in your XPath expressions to namespace URIs. Unfortunately, there is no default implementation.

public String resolveXPath(File xmlFile, String xpathExpr) throws XPathException, ParserConfigurationException, SAXException, IOException
{
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

    // Turn namespace aware on
    factory.setNamespaceAware(true);
    DocumentBuilder builder = factory.newDocumentBuilder();
    Document doc = builder.parse(xmlFile);

    XPathFactory xPathfactory = XPathFactory.newInstance();
    XPath xpath = xPathfactory.newXPath();

    // Set the NamespaceContext        
    xpath.setNamespaceContext(new MyNamespaceContext());


    XPathExpression expr = xpath.compile(xpathExpr);

    return (String) expr.evaluate(doc, XPathConstants.STRING);
}

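// Needs imports: java.util.Iterator, java.util.Map, java.util.Map.Entry, java.util.TreeMap,
// javax.xml.XMLConstants, javax.xml.namespace.NamespaceContext
class MyNamespaceContext implements NamespaceContext {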

  private Map<String, String> ns;
  private Map<String, String> nsReverted;

  public MyNamespaceContext() {
    ns = new TreeMap<String, String>();

    // Default namespaces and prefixes according to the documentation
    ns.put(XMLConstants.DEFAULT_NS_PREFIX, XMLConstants.NULL_NS_URI);
    ns.put(XMLConstants.XML_NS_PREFIX, XMLConstants.XML_NS_URI);
    ns.put(XMLConstants.XMLNS_ATTRIBUTE, XMLConstants.XMLNS_ATTRIBUTE_NS_URI);

    // Now our self defined namespace
    ns.put("file", "http://www.example.com/xml/file");


    nsReverted = new TreeMap<String, String>();
    for(Entry<String, String> entry : ns.entrySet()) {
      // Reverse mapping: namespace URI -> prefix
      nsReverted.put(entry.getValue(), entry.getKey());
    }
  }

  @Override
  public String getNamespaceURI(String prefix) {
    if(prefix == null) {
      throw new IllegalArgumentException();
    } 
    final String uri = ns.get(prefix);
    return uri == null ? XMLConstants.NULL_NS_URI : uri;
  }

  @Override
  public String getPrefix(String namespaceURI) {
    if(namespaceURI == null) {
      throw new IllegalArgumentException();
    } 
    return nsReverted.get(namespaceURI);
  }

  @Override
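  public Iterator<String> getPrefixes(String namespaceURI) {
    if(namespaceURI == null) {
      throw new IllegalArgumentException();
    }
    // Return only the prefixes that are bound to the requested URI
    final Map<String, String> matching = new TreeMap<String, String>();
    for(Entry<String, String> entry : ns.entrySet()) {
      if(entry.getValue().equals(namespaceURI)) {
        matching.put(entry.getKey(), entry.getValue());
      }
    }
    return matching.keySet().iterator();
  }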

}
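
To check the approach end to end, here is a self-contained sketch that parses the second document from a string and evaluates the prefixed expressions. The class name PrefixedXPathDemo, the evaluate helper and the single-binding FILE_NS context are assumptions for illustration. The context is deliberately smaller than the one above and omits the reserved xml and xmlns bindings that a complete implementation should return:

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;

import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class PrefixedXPathDemo {

    // The second document from the question, inlined so the sketch is self-contained
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>"
        + "<Document xmlns:file=\"http://www.example.com/xml/file\">"
        + "<DocumentFormat>Email</DocumentFormat>"
        + "<file:FileFormat>PDF</file:FileFormat>"
        + "</Document>";

    static final String FILE_URI = "http://www.example.com/xml/file";

    // Simplified context with one binding: "file" -> FILE_URI
    static final NamespaceContext FILE_NS = new NamespaceContext() {
        @Override
        public String getNamespaceURI(String prefix) {
            if (prefix == null) {
                throw new IllegalArgumentException("prefix must not be null");
            }
            return "file".equals(prefix) ? FILE_URI : XMLConstants.NULL_NS_URI;
        }

        @Override
        public String getPrefix(String namespaceURI) {
            if (namespaceURI == null) {
                throw new IllegalArgumentException("namespaceURI must not be null");
            }
            return FILE_URI.equals(namespaceURI) ? "file" : null;
        }

        @Override
        public Iterator<String> getPrefixes(String namespaceURI) {
            String prefix = getPrefix(namespaceURI);
            return prefix == null
                ? Collections.<String>emptyList().iterator()
                : Collections.singletonList(prefix).iterator();
        }
    };

    // Hypothetical helper: namespace-aware parse plus XPath evaluation with FILE_NS registered
    static String evaluate(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(FILE_NS);
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            // Checked exceptions are wrapped only to keep the sketch short
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(evaluate(XML, "//file:FileFormat"));         // prints PDF
        System.out.println(evaluate(XML, "/Document/file:FileFormat")); // prints PDF
    }
}
```

The prefix used in the expression only has to match the NamespaceContext, not the document: what the XPath engine compares is the namespace URI.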

Other tips

"Now /Document/FileFormat returns PDF" -- Given what you've shown us, it shouldn't.

To search for namespaced nodes using XPath, you must either use prefixes in the XPath and tell the XPath engine which namespaces those prefixes refer to, or kluge around it by explicitly matching on local-name() and namespace-uri() in a predicate.

See https://stackoverflow.com/questions/6390339/how-to-query-xml-using-namespaces-in-java-with-xpath
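
The second option, matching on local-name() and namespace-uri(), might look like the following sketch. It needs no NamespaceContext, but it assumes the document was parsed with a namespace-aware factory. The class name LocalNameXPathDemo and the evaluate helper are hypothetical:

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class LocalNameXPathDemo {

    // The second document from the question, inlined so the sketch is self-contained
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>"
        + "<Document xmlns:file=\"http://www.example.com/xml/file\">"
        + "<DocumentFormat>Email</DocumentFormat>"
        + "<file:FileFormat>PDF</file:FileFormat>"
        + "</Document>";

    // Hypothetical helper: namespace-aware parse, no NamespaceContext registered
    static String evaluate(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            // Checked exceptions are wrapped only to keep the sketch short
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Match any element by local name and namespace URI instead of by prefix
        String expression = "//*[local-name()='FileFormat'"
            + " and namespace-uri()='http://www.example.com/xml/file']";
        System.out.println(evaluate(XML, expression)); // prints PDF
    }
}
```

This avoids the NamespaceContext boilerplate at the cost of longer, harder-to-read expressions, so it tends to suit one-off queries better than a larger set of XPaths.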

License: CC-BY-SA with attribution
Not affiliated with StackOverflow