Question

I'm trying to use JAXB to unmarshal an XML file into objects but have run into a few difficulties. The actual project's XML file has a few thousand lines, so I've reproduced the error on a smaller scale as follows:

The XML file:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<catalogue title="some catalogue title" 
           publisher="some publishing house" 
           xmlns="x-schema:TamsDataSchema.xml"/>

The XSD file for producing JAXB classes

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
 <xsd:element name="catalogue" type="catalogueType"/>

 <xsd:complexType name="catalogueType">
  <xsd:sequence>
   <xsd:element ref="journal"  minOccurs="0" maxOccurs="unbounded"/>
  </xsd:sequence>
  <xsd:attribute name="title" type="xsd:string"/>
  <xsd:attribute name="publisher" type="xsd:string"/>
 </xsd:complexType>
</xsd:schema>

Code snippet 1:

final JAXBContext context = JAXBContext.newInstance(CatalogueType.class);
final Unmarshaller um = context.createUnmarshaller();
CatalogueType ct = (CatalogueType)um.unmarshal(new File("file output address"));

Which throws the error:

javax.xml.bind.UnmarshalException: unexpected element (uri:"x-schema:TamsDataSchema.xml", local:"catalogue"). Expected elements are <{}catalogue>
 at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallingContext.handleEvent(UnmarshallingContext.java:642)
 at com.sun.xml.bind.v2.runtime.unmarshaller.Loader.reportError(Loader.java:247)
 at com.sun.xml.bind.v2.runtime.unmarshaller.Loader.reportError(Loader.java:242)
 at com.sun.xml.bind.v2.runtime.unmarshaller.Loader.reportUnexpectedChildElement(Loader.java:116)
 at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallingContext$DefaultRootLoader.childElement(UnmarshallingContext.java:1049)
 at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallingContext._startElement(UnmarshallingContext.java:478)
 at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallingContext.startElement(UnmarshallingContext.java:459)
 at com.sun.xml.bind.v2.runtime.unmarshaller.SAXConnector.startElement(SAXConnector.java:148)
 at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(Unknown Source)
 at com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyElement(Unknown Source)
 at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
 at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl$NSContentDispatcher.scanRootElementHook(Unknown Source)
 at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
 at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
 at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
 at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
 at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
    ...etc

So the namespace in the XML document is causing the problem: if it's removed, everything works fine. Unfortunately the file is supplied by the client, so we're stuck with it. I've tried numerous ways of specifying the namespace in the XSD, but none of the permutations seem to work.
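The "Expected elements are <{}catalogue>" part of the message can be read as a plain name comparison: an XML name is a (namespace URI, local name) pair, and two names with the same local part but different namespaces are not equal. A minimal sketch using only javax.xml.namespace.QName from the JDK (the class name QNameMismatch is made up for illustration, and this is not how JAXB is implemented internally):

```java
import javax.xml.namespace.QName;

public class QNameMismatch {
    // Name that classes generated from the namespace-less XSD expect.
    static final QName EXPECTED = new QName("", "catalogue");

    // Name the parser reports for the client's document.
    static final QName ACTUAL = new QName("x-schema:TamsDataSchema.xml", "catalogue");

    // QName equality compares both the namespace URI and the local part.
    static boolean matches() {
        return EXPECTED.equals(ACTUAL);
    }

    public static void main(String[] args) {
        System.out.println("expected=" + EXPECTED + " actual=" + ACTUAL
                + " equal=" + matches());
    }
}
```

Either side of the comparison has to change: the document loses its namespace before JAXB sees it, or the schema (and the generated classes) gain one.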

I also attempted to unmarshal ignoring namespace using the following code:

Unmarshaller um = context.createUnmarshaller();
final SAXParserFactory sax = SAXParserFactory.newInstance();
sax.setNamespaceAware(false);
final XMLReader reader = sax.newSAXParser().getXMLReader();
final Source er = new SAXSource(reader, new InputSource(new FileReader("file location")));
CatalogueType ct = (CatalogueType)um.unmarshal(er);
System.out.println(ct.getPublisher());
System.out.println(ct.getTitle());

which runs without error but fails to unmarshal the element's attributes and prints

null
null
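A likely explanation for the null values, which nothing in this thread confirms: when namespace processing is switched off, SAX is allowed to report an empty local name for each attribute and only fill in the qualified name, so a consumer that looks attributes up by local name finds nothing. The sketch below uses only the JDK parser; the class name AttributeNames and the shortened sample document are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class AttributeNames {
    static final String XML =
        "<catalogue title=\"some catalogue title\" xmlns=\"x-schema:TamsDataSchema.xml\"/>";

    /** Returns one "localName|qName" entry per attribute reported on the root element. */
    static List<String> describe(boolean namespaceAware) {
        final List<String> out = new ArrayList<String>();
        try {
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(namespaceAware);
            spf.newSAXParser().parse(new InputSource(new StringReader(XML)), new DefaultHandler() {
                @Override
                public void startElement(String uri, String local, String qName, Attributes atts) {
                    for (int i = 0; i < atts.getLength(); i++) {
                        out.add(atts.getLocalName(i) + "|" + atts.getQName(i));
                    }
                }
            });
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("namespace-aware:     " + describe(true));
        System.out.println("not namespace-aware: " + describe(false));
    }
}
```

If the non-namespace-aware run shows empty local names, that would be consistent with the attributes being dropped by an older JAXB runtime and picked up by a newer one.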

For reasons beyond our control we're limited to Java 1.5 and JAXB 2.0, which is unfortunate because the second code block works as desired on Java 1.6.

Any suggestions would be greatly appreciated. The alternative is cutting the namespace declaration out of the file before parsing it, which seems inelegant.

Solution

The thing about JAXB is, it actually implements XML and XML schema correctly. That sounds like a good thing, but as you're discovering, JAXB can often be a bit ... too literal.

So it looks to me like you've got an XSD that says "expect a catalogue here", and XML that says "here's a {x-schema:TamsDataSchema.xml}catalogue", and unsurprisingly JAXB gets overly strict and says "that ain't cool." There is no way to work around this that I can see; either you pre-parse the XML to remove the namespace, or you adjust your schema to allow it.

Either solution is, as you said, inelegant, but when you're trying to fit a square peg into a round hole sometimes you need to be a bit inelegant (and you're basically saying "fit this square/namespaced peg into a round/non-namespaced hole", so ...)

OTHER TIPS

Thank you for this post and your code snippet. It definitely put me on the right path as I was also going nuts trying to deal with some vendor-provided XML that had xmlns="http://vendor.com/foo" all over the place.

My first solution (before I read your post) was to take the XML in a String, then xmlString.replaceAll(" xmlns=", " ylmns="); (the horror, the horror). Besides offending my sensibility, it was a pain when processing XML from an InputStream.

My second solution, after looking at your code snippet: (I'm using Java7)

// given an InputStream inputStream:
String packageName = docClass.getPackage().getName();
JAXBContext jc = JAXBContext.newInstance(packageName);
Unmarshaller u = jc.createUnmarshaller();

InputSource is = new InputSource(inputStream);
final SAXParserFactory sax = SAXParserFactory.newInstance();
sax.setNamespaceAware(false);
final XMLReader reader;
try {
    reader = sax.newSAXParser().getXMLReader();
} catch (SAXException | ParserConfigurationException e) {
    throw new RuntimeException(e);
}
SAXSource source = new SAXSource(reader, is);
@SuppressWarnings("unchecked")
JAXBElement<T> doc = (JAXBElement<T>)u.unmarshal(source);
return doc.getValue();

But now I've found a third solution which I like much better, and hopefully it will be useful to others: properly defining the expected namespace in the schema:

<xsd:schema jxb:version="2.0"
  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
  xmlns:jxb="http://java.sun.com/xml/ns/jaxb"
  xmlns="http://vendor.com/foo"
  targetNamespace="http://vendor.com/foo"
  elementFormDefault="unqualified"
  attributeFormDefault="unqualified">

With that, we can now remove the sax.setNamespaceAware(false); line. (Update: actually, if we keep the unmarshal(SAXSource) call, then we need sax.setNamespaceAware(true). The simpler way is to not bother with SAXSource and the code surrounding its creation, and instead call unmarshal(InputStream), which is namespace-aware by default.) The output of marshal() then has the proper namespace too.

Yeah. Only about 4 hours down the drain.
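Before regenerating classes with xjc, one way to check that a schema's targetNamespace really matches the incoming document is to validate a sample against it with javax.xml.validation from the JDK. This is only a sketch: it reuses the namespace and attributes from the question rather than http://vendor.com/foo, and the class and method names are invented for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class TargetNamespaceCheck {
    static final String NS = "x-schema:TamsDataSchema.xml";
    static final String DOC =
        "<catalogue title=\"t\" publisher=\"p\" xmlns=\"" + NS + "\"/>";

    /** Builds the question's schema, optionally declaring a target namespace. */
    static String schema(boolean withTargetNamespace) {
        String ns = withTargetNamespace
            ? " xmlns=\"" + NS + "\" targetNamespace=\"" + NS + "\""
            : "";
        return "<xsd:schema xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\"" + ns + ">"
            + "<xsd:element name=\"catalogue\" type=\"catalogueType\"/>"
            + "<xsd:complexType name=\"catalogueType\">"
            + "<xsd:attribute name=\"title\" type=\"xsd:string\"/>"
            + "<xsd:attribute name=\"publisher\" type=\"xsd:string\"/>"
            + "</xsd:complexType></xsd:schema>";
    }

    /** Returns true if DOC validates against the chosen schema variant. */
    static boolean validates(boolean withTargetNamespace) {
        try {
            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema s = sf.newSchema(new StreamSource(new StringReader(schema(withTargetNamespace))));
            s.newValidator().validate(new StreamSource(new StringReader(DOC)));
            return true;
        } catch (SAXException e) {
            return false; // schema or document rejected
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("without targetNamespace: " + validates(false));
        System.out.println("with targetNamespace:    " + validates(true));
    }
}
```

If the variant with the target namespace validates, classes generated from it should carry the same namespace in their annotations, which is what the unmarshaller compares against.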

How to ignore the namespaces

You can use an XMLStreamReader that is not namespace-aware. It basically strips all namespace information from the XML that you're parsing:

JAXBContext jc = JAXBContext.newInstance(your.ObjectFactory.class);
XMLInputFactory xif = XMLInputFactory.newFactory();
xif.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false); // this is the magic line
StreamSource source = new StreamSource(f);
XMLStreamReader xsr = xif.createXMLStreamReader(source);
Unmarshaller unmarshaller = jc.createUnmarshaller();
Object unmarshal = unmarshaller.unmarshal(xsr);

Now the actual xml that gets fed into JAXB doesn't have any namespace info.
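To see what the IS_NAMESPACE_AWARE switch changes without involving JAXB at all, here is a small sketch that reads only the root element with the JDK's StAX implementation and reports its namespace. The class name StaxNamespaceDemo and the inline sample document are assumptions; the helper maps a null namespace to an empty string so both cases return a String.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxNamespaceDemo {
    static final String NS = "x-schema:TamsDataSchema.xml";
    static final String XML =
        "<catalogue title=\"some catalogue title\" xmlns=\"" + NS + "\"/>";

    /** Returns the namespace URI reported for the root element, or "" if none. */
    static String rootNamespace(boolean namespaceAware) {
        try {
            XMLInputFactory xif = XMLInputFactory.newInstance();
            xif.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, namespaceAware);
            XMLStreamReader xsr = xif.createXMLStreamReader(new StringReader(XML));
            xsr.nextTag(); // advance to the root START_ELEMENT
            String uri = xsr.getNamespaceURI();
            xsr.close();
            return uri == null ? "" : uri;
        } catch (XMLStreamException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("aware:     [" + rootNamespace(true) + "]");
        System.out.println("not aware: [" + rootNamespace(false) + "]");
    }
}
```

Note that support for turning namespace awareness off is optional in the StAX specification, so a different StAX implementation on the classpath may reject the property.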


Important note (xjc)

If you generated Java classes from an XSD schema using xjc and the schema had a namespace defined, then the generated annotations will carry that namespace, so remove it manually. Otherwise JAXB won't recognize the namespace-less data.

Places where the annotations should be changed:

  • ObjectFactory.java

    // change this line
    private final static QName _SomeType_QNAME = new QName("some-weird-namespace", "SomeType");
    // to something like
    private final static QName _SomeType_QNAME = new QName("", "SomeType", "");
    
    // and this annotation
    @XmlElementDecl(namespace = "some-weird-namespace", name = "SomeType")
    // to this
    @XmlElementDecl(namespace = "", name = "SomeType")
    
  • package-info.java

    // change this annotation
    @javax.xml.bind.annotation.XmlSchema(namespace = "some-weird-namespace", elementFormDefault = javax.xml.bind.annotation.XmlNsForm.QUALIFIED)
    // to something like this
    @javax.xml.bind.annotation.XmlSchema(namespace = "", elementFormDefault = javax.xml.bind.annotation.XmlNsForm.QUALIFIED)
    

Now your JAXB code will expect to see everything without any namespaces and the XMLStreamReader that we created supplies just that.

Here is my solution for this namespace-related issue. We can trick JAXB by implementing our own XMLFilter and Attributes.

class MyAttr extends  AttributesImpl {

    MyAttr(Attributes atts) {
        super(atts);
    }

    @Override
    public String getLocalName(int index) {
        return super.getQName(index);
    }

}

class MyFilter extends XMLFilterImpl {

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException {
        super.startElement(uri, localName, qName, new MyAttr(atts));
    }

}

public SomeObject testFromXML(InputStream input) {

    try {
        // Create the JAXBContext
        JAXBContext jc = JAXBContext.newInstance(SomeObject.class);

        // Create the XMLFilter
        XMLFilter filter = new MyFilter();

        // Set the parent XMLReader on the XMLFilter
        SAXParserFactory spf = SAXParserFactory.newInstance();
        //spf.setNamespaceAware(false);

        SAXParser sp = spf.newSAXParser();
        XMLReader xr = sp.getXMLReader();
        filter.setParent(xr);

        // Set UnmarshallerHandler as ContentHandler on XMLFilter
        Unmarshaller unmarshaller = jc.createUnmarshaller();
        UnmarshallerHandler unmarshallerHandler = unmarshaller
                .getUnmarshallerHandler();
        filter.setContentHandler(unmarshallerHandler);

        // Parse the XML
        InputSource is = new InputSource(input);
        filter.parse(is);
        return (SomeObject) unmarshallerHandler.getResult();

    }catch (Exception e) {
        logger.debug(ExceptionUtils.getFullStackTrace(e));
    }

    return null;
}

There is a workaround for this issue explained in this post: JAXB: How to ignore namespace during unmarshalling XML document?. It explains how to dynamically add/remove xmlns entries from XML using a SAX Filter. Handles marshalling and unmarshalling alike.
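For readers who don't want to follow the link, the removal half of that idea can be sketched roughly as follows. This is not the code from the linked post; it is a minimal XMLFilterImpl, under the assumption that only element namespaces need to be dropped, and the class and method names are invented. A namespace-aware parser feeds the filter, and the filter forwards every element with an empty namespace URI.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

public class NamespaceStrippingFilter extends XMLFilterImpl {
    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        // Forward the element as if it had been written without a namespace.
        super.startElement("", localName, localName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        super.endElement("", localName, localName);
    }

    @Override
    public void startPrefixMapping(String prefix, String uri) {
        // Swallow namespace declarations so nothing downstream sees them.
    }

    @Override
    public void endPrefixMapping(String prefix) {
        // Matching no-op for startPrefixMapping.
    }

    /** Parses xml through the filter and returns "{uri}local" for the root as seen downstream. */
    static String rootSeenDownstream(String xml) {
        final StringBuilder seen = new StringBuilder();
        try {
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(true);
            XMLReader parent = spf.newSAXParser().getXMLReader();
            NamespaceStrippingFilter filter = new NamespaceStrippingFilter();
            filter.setParent(parent);
            // Stand-in for the JAXB UnmarshallerHandler used elsewhere in this thread.
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String local, String qName, Attributes atts) {
                    if (seen.length() == 0) {
                        seen.append('{').append(uri).append('}').append(local);
                    }
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return seen.toString();
    }
}
```

With JAXB, the recording handler would be replaced by unmarshaller.getUnmarshallerHandler(), as in the snippet further up. Prefixed attributes keep their namespace in this sketch; they would need an Attributes wrapper of their own.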

Licensed under: CC-BY-SA with attribution