Question

I'm implementing schema validation using libxml2. The schema I'm validating against imports two other schemas with lines like:

<xs:import namespace="http://www.w3.org/XML/1998/namespace"
           schemaLocation="http://www.somewebsite.com/xsd/xml.xsd"/>

All three schema files are located in the same directory on the device.

This works well when the device has internet access, but fails when it does not, as libxml2 still attempts to download the imported schemas from the schemaLocation even though I'm passing in XML_PARSE_NONET.

I tried getting libxml2 to load the files locally by editing the schemaLocation attribute to xml.xsd, ./xml.xsd, and file:///data/data/com.company.appname/files/xml.xsd, but all three resulted in the same libxml2 error:

  • domain: 16
  • code: 3069 (XML_SCHEMAP_INTERNAL)
  • message: Internal error: xmlSchemaParse, An internal error occurred.

I also tried removing the schemaLocation attribute entirely, on the off-chance that libxml2 might search for the imported schemas alongside the original schema, but that resulted in the following error when the schema parser hit a line that referenced the imported entities:

<xs:attribute ref="xml:lang" use="required"/>
  • domain: 16
  • code: 3004 (XML_SCHEMAP_SRC_RESOLVE)
  • message: attribute use (unknown), attribute 'ref': The QName value '{http://www.w3.org/XML/1998/namespace}lang' does not resolve to a(n) attribute declaration.

I also looked into manually merging the three schemas into a single file, but as they use different namespaces, this is not possible.

The standard solution for this seems to be the XML catalog, but I've read through libxml2's catalog documentation, and I can't figure out how (or even whether it's possible) to add mappings that will be used by my app when deployed to a device. I think I might need to implement an xmlExternalEntityLoader, but the documentation for that is quite slim.

How can I get libxml2 to import these schemas without network access? Ideally I'd like a robust solution that works with the unedited schema, but I'd be content with something quick-and-dirty that involves editing the schema, like my original attempts described above.

The errors described above are from an Android device (using JNI), but I'm having similar problems on iOS, where the solution will also need to work.

Solution

One way to do this is to intercept libxml2's call to open the imported URL with a custom xmlExternalEntityLoader.

The basic code for doing this is as follows:

#include <libxml/xmlIO.h>
#include <libxml/parserInternals.h>

xmlExternalEntityLoader defaultLoader = NULL;

xmlParserInputPtr
xmlMyExternalEntityLoader(const char *URL, const char *ID,
                          xmlParserCtxtPtr ctxt) {
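    xmlParserInputPtr ret = NULL;
    const char *fileID = NULL;
    /* Look up the local file for this resource and store its path in fileID.
     * The documentation suggests using the ID, but for me this was
     * always NULL so I had to look up by URL instead.
     */

    /* Only try the local file if the lookup actually found one. */
    if (fileID != NULL)
        ret = xmlNewInputFromFile(ctxt, fileID);
    if (ret != NULL)
        return(ret);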
    if (defaultLoader != NULL)
        ret = defaultLoader(URL, ID, ctxt);
    return(ret);
}

int main(..) {
    ...

    /*
     * Install our own entity loader
     */
    defaultLoader = xmlGetExternalEntityLoader();
    xmlSetExternalEntityLoader(xmlMyExternalEntityLoader);

    ...
}

(Slightly adjusted from the sample code in The entities loader section of libxml2's I/O Interfaces documentation.)
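
The lookup step is left as a comment in the code above. One possible way to fill it in, assuming the imported schemas are stored locally under the same file names they have in their URLs, is to keep only the last path segment of the URL and prepend the local directory. The helper name map_url_to_local and its parameters are hypothetical (they are not part of libxml2), and this is only a sketch of the idea:

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libxml2): maps a schema URL to a file of
 * the same name inside local_dir. Returns 0 on success, or -1 if the URL has
 * no file name or the resulting path does not fit in out.
 */
static int
map_url_to_local(const char *url, const char *local_dir,
                 char *out, size_t out_size) {
    const char *name;
    int written;

    if (url == NULL || local_dir == NULL || out == NULL || out_size == 0)
        return -1;

    /* Keep only what follows the last '/', e.g. "xml.xsd". */
    name = strrchr(url, '/');
    name = (name != NULL) ? name + 1 : url;
    if (*name == '\0')
        return -1;

    /* Build "<local_dir>/<name>" and reject truncated results. */
    written = snprintf(out, out_size, "%s/%s", local_dir, name);
    if (written < 0 || (size_t)written >= out_size)
        return -1;
    return 0;
}
```

Inside the entity loader, the URL argument and the app's files directory would be passed to this helper, and on success the resulting buffer would be used as fileID. Matching on the file name alone is a simplification: if two imports could share a file name, an explicit table from full URL to local path would be safer.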

Licensed under: CC-BY-SA with attribution