partial schema included in multiple subschemas

https://stackoverflow.com/questions/17156899

01-06-2022

Question

My aim is to make a modular XML schema in which some shared types live in one file that is available to all subschema files. What's the best way to go about this?

Example:

Say I want to build an XML schema that describes XML documents about cars and bikes. I split the schema into four files: vehicles.xsd, cars.xsd, bikes.xsd and types.xsd. vehicles.xsd includes cars.xsd and bikes.xsd, and each of those in turn includes types.xsd. When I tried this example with the command

xmllint --schema vehicles.xsd vehicles.xml

that it validates fine, even though I was expecting a conflict to arise because of the double inclusion of types.xsd (which leads to 2 definitions of the complexType vehicleType). Removing the <include> tag from either cars.xsd or bikes.xsd also validates just fine. Can someone explain to me what is going on here?

XML and XSDs:

vehicles.xml:

<vehicles xmlns="http://example.com/vehicles">
  <cars>
    <car make="Porsche" model="911" />
    <car make="Porsche" model="911" />
  </cars>
  <bikes>
    <bike make="Harley-Davidson" model="WL" />
    <bike make="Yamaha" model="XS650" />
  </bikes>
</vehicles>

vehicles.xsd:

<xs:schema
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:vh="http://example.com/vehicles"
  targetNamespace="http://example.com/vehicles"
  elementFormDefault="qualified">

  <xs:include schemaLocation="cars.xsd" />
  <xs:include schemaLocation="bikes.xsd" />

  <xs:element name="vehicles">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="vh:cars" />
        <xs:element ref="vh:bikes" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

cars.xsd:

<xs:schema
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:vh="http://example.com/vehicles"
  targetNamespace="http://example.com/vehicles"
  elementFormDefault="qualified">

  <xs:include schemaLocation="types.xsd" />

  <xs:element name="cars">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="car" type="vh:vehicleType"
          minOccurs="0" maxOccurs="unbounded" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

bikes.xsd:

<xs:schema
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:vh="http://example.com/vehicles"
  targetNamespace="http://example.com/vehicles"
  elementFormDefault="qualified">

  <xs:include schemaLocation="types.xsd" />

  <xs:element name="bikes">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="bike" type="vh:vehicleType"
          minOccurs="0" maxOccurs="unbounded" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

types.xsd:

<xs:schema
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  targetNamespace="http://example.com/vehicles">

  <xs:complexType name="vehicleType">
    <xs:attribute name="make" type="xs:string" />
    <xs:attribute name="model" type="xs:string" />
  </xs:complexType>
</xs:schema>

Solution

When asked to include a schema document such as types.xsd, most XSD processors notice if they have already included it and do not include it a second time; the XSD spec explicitly encourages this. That is why you get no error messages over the double inclusion. A single inclusion also works because xs:include merges all the included documents into one schema for the target namespace, so vehicleType is visible to both cars.xsd and bikes.xsd as long as types.xsd is included somewhere.
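As a rough illustration, the schema the processor ends up with is conceptually equivalent to a single document like the one below. This is only a sketch of the merged result built from the four files in the question; it is not output produced by xmllint, and the order of the components is arbitrary.

```xml
<!-- Conceptual result of resolving all xs:include elements (illustration only) -->
<xs:schema
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:vh="http://example.com/vehicles"
  targetNamespace="http://example.com/vehicles"
  elementFormDefault="qualified">

  <!-- From types.xsd: present once, however many documents included it -->
  <xs:complexType name="vehicleType">
    <xs:attribute name="make" type="xs:string" />
    <xs:attribute name="model" type="xs:string" />
  </xs:complexType>

  <!-- From cars.xsd -->
  <xs:element name="cars">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="car" type="vh:vehicleType"
          minOccurs="0" maxOccurs="unbounded" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- From bikes.xsd -->
  <xs:element name="bikes">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="bike" type="vh:vehicleType"
          minOccurs="0" maxOccurs="unbounded" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- From vehicles.xsd -->
  <xs:element name="vehicles">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="vh:cars" />
        <xs:element ref="vh:bikes" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

References such as vh:vehicleType are resolved against this combined set of components, not against the individual file in which they appear.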

In general, however, there is slightly better interoperability among XSD processors if you keep things simpler by doing all inclusions from a single top-level driver file. If you used that idiom, you'd drop the xs:include elements from all your schema documents, and make one or more new driver documents which contain nothing but inclusions (one if you only want one schema; multiple driver documents if you want different schemas with different sets of elements).
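A minimal sketch of such a driver document for the example in the question might look like this. The file name vehicles-driver.xsd is made up for illustration, and the sketch assumes the xs:include elements have been removed from vehicles.xsd, cars.xsd and bikes.xsd.

```xml
<!-- vehicles-driver.xsd (hypothetical name): contains nothing but inclusions -->
<xs:schema
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  targetNamespace="http://example.com/vehicles"
  elementFormDefault="qualified">

  <!-- Each schema document is included exactly once, from this file only -->
  <xs:include schemaLocation="types.xsd" />
  <xs:include schemaLocation="cars.xsd" />
  <xs:include schemaLocation="bikes.xsd" />
  <xs:include schemaLocation="vehicles.xsd" />
</xs:schema>
```

Validation would then point at the driver instead of vehicles.xsd, for example xmllint --schema vehicles-driver.xsd vehicles.xml. One trade-off of this layout is that cars.xsd and bikes.xsd no longer resolve vh:vehicleType when loaded on their own; they are meant to be used only through a driver.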

Similar considerations apply to the use of the schemaLocation attribute on xs:import elements. The use of this idiom helps avoid situations (especially situations involving redefinition and reference cycles) which produce dramatically different results from different processors.
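Applied to imports, the same idiom could be sketched as follows. The second namespace http://example.com/owners and the file owners.xsd are assumptions invented for this illustration; they are not part of the question. As above, the sketch assumes the component documents contain no xs:include elements of their own.

```xml
<!-- Hypothetical driver: the only document that says where schema documents live -->
<xs:schema
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  targetNamespace="http://example.com/vehicles"
  elementFormDefault="qualified">

  <!-- Assumed second namespace and file name, for illustration only -->
  <xs:import namespace="http://example.com/owners" schemaLocation="owners.xsd" />

  <xs:include schemaLocation="types.xsd" />
  <xs:include schemaLocation="cars.xsd" />
  <xs:include schemaLocation="bikes.xsd" />
  <xs:include schemaLocation="vehicles.xsd" />

  <!-- In cars.xsd and the other component documents, the import would
       name the namespace but give no location:
       <xs:import namespace="http://example.com/owners" />
  -->
</xs:schema>
```

How a processor treats an xs:import without schemaLocation varies, so it is worth checking this layout against the processors you actually need to support.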

OTHER TIPS

From the W3C specs for XML Schema, I believe the intention was that schemas can be modular and included wherever they are required. The schema processor should resolve all includes and imports first.

I have an extensive schema that is broken up into small modular pieces. This works well with many XML processors and XML editors.

Licensed under: CC-BY-SA with attribution
