Question

I'm trying to read and validate an XML document that has external entities, but neither reading nor validating works. I used this to create a test example.

Test xml:

<?xml version="1.0" standalone="no" ?>
<!DOCTYPE doc [
<!ENTITY otherFile SYSTEM "otherFile.xml">
]>
<doc>&otherFile;</doc>

Other xml:

<baz>this is my content</baz>

Test xsd:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="doc">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="baz"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name="baz" type="xs:string"/>
</xs:schema>

First, I try to read the contents of test.xml using QDomDocument:

QDomDocument doc;
doc.setContent(&testFile);
qDebug() << doc.toString();

But in the debug output I get raw text from test.xml. The external entity is not substituted.

Then I try to validate test.xml against test.xsd:

QXmlSchema schema;
// Load the schema; the URL is used to resolve relative references.
if (schema.load(&xsdFile, QUrl::fromLocalFile(xsdPath)))
{
    QXmlSchemaValidator validator(schema);
    // Validate the instance document against the loaded schema.
    if (validator.validate(&xmlFile, QUrl::fromLocalFile(xmlPath)))
    {
        qDebug() << "xml" << xmlName << "is valid";
    }
    else
    {
        qDebug() << "xml" << xmlName << "is invalid";
    }
}

But validation fails and I get the following error:

Error XSDError in file:///..., at line 5, column 5: Element doc is missing child element.

Am I doing something wrong, or does the Qt XML module simply not support external entities?

Solution

I've looked into this, and the short answer is that you will probably need a different parser and validator if you want DTD SYSTEM entity support.

Qt 4 has three different XML parsers: QXmlSimpleReader (SAX), QDomDocument (DOM), and QXmlStreamReader (streaming).

Having three different types of parser was judged too complicated, so with the move to Qt 5 the XML module has been deprecated and the recommended parser is now QXmlStreamReader. It is quite easy to use (unlike QXmlSimpleReader) and uses far less memory than QDomDocument.

Therefore, if you are writing new code in Qt for XML parsing, even if you aren't currently using Qt 5, I would strongly recommend using QXmlStreamReader.

Unfortunately for you, the manual page notes:

QXmlStreamReader is a well-formed XML 1.0 parser that does not include external parsed entities.

This means it doesn't resolve your SYSTEM entities. Also, after inspecting the source code, I couldn't find any 'hidden' hooks or methods you can use to intercept the entity resolution.

If you want to include external XML documents into another document, you may want to look at using XInclude. It would be reasonably simple to write an XInclude processor using QXmlStreamReader and QXmlStreamWriter.

Here is a basic Qt XInclude processor I wrote. It only does one level of include, but I'm sure you could extend it to support recursive inclusion reasonably easily.
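
To illustrate the one-level idea independently of Qt, here is a minimal sketch in standard C++. It is not the processor mentioned above. It uses a regular expression rather than QXmlStreamReader, only handles self-closing xi:include elements with a double-quoted href attribute and the conventional "xi" prefix, and the names expandXIncludes, Loader and loadFromFile are made up for this example.

```cpp
#include <cstddef>
#include <fstream>
#include <functional>
#include <regex>
#include <sstream>
#include <string>

// Hypothetical helper type: maps an href to the text of the referenced document.
using Loader = std::function<std::string(const std::string&)>;

// Replace each self-closing <xi:include href="..."/> with the loaded text.
// One level only: the included text is not scanned for further includes.
std::string expandXIncludes(const std::string& xml, const Loader& load)
{
    // Assumption: the XInclude namespace is bound to the "xi" prefix and
    // href is written with double quotes. A real processor would parse the XML.
    static const std::regex includeTag(
        R"rx(<xi:include\b[^>]*?\bhref="([^"]*)"[^>]*?/>)rx");

    std::string out;
    std::size_t last = 0;
    const std::sregex_iterator end;
    for (std::sregex_iterator it(xml.begin(), xml.end(), includeTag); it != end; ++it) {
        const std::smatch& m = *it;
        const std::size_t pos = static_cast<std::size_t>(m.position(0));
        out.append(xml, last, pos - last);   // text before the include element
        out += load(m[1].str());             // content of the referenced document
        last = pos + static_cast<std::size_t>(m.length(0));
    }
    out.append(xml, last, std::string::npos); // text after the last include
    return out;
}

// Example loader that treats the href as a local file path.
// Returns an empty string if the file cannot be read.
std::string loadFromFile(const std::string& path)
{
    std::ifstream in(path);
    std::ostringstream buf;
    buf << in.rdbuf();
    return buf.str();
}
```

With the files from the question, test.xml would carry an xi:include element with href="otherFile.xml" inside doc instead of the &otherFile; reference, and expandXIncludes(text, loadFromFile) would return the document with the baz element inlined. Passing the loader as a parameter keeps the expansion step separate from file access, so it can be swapped for a Qt-based reader.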

Once you have a fully resolved XML document, you should be able to use the QXmlSchemaValidator to validate it.

Licensed under: CC-BY-SA with attribution