Question

I'm pretty new to the nuts and bolts of building and validating XML documents, so I hope that I'm asking a trivial question.

I've got a hierarchy of topics and sub-topics described in a document:

<topics>
  <topic> <name>First Topic</name>
     <subtopic> <name> Subtopic 1 </name> </subtopic>
     <subtopic> <name> Subtopic 2 </name> </subtopic>
  </topic>
  <topic> <name>Second Topic</name>
     <subtopic> <name> Subtopic 3 </name> </subtopic>
     <subtopic> <name> Subtopic 4 </name> </subtopic>
  </topic>
</topics>

And another document that has a topic and subtopic in it:

<mydoc>
   <topic>First Topic</topic>
   <subtopic>Subtopic 1</subtopic>
   ... rest of the doc ...
</mydoc>

I want to make sure that mydoc only includes valid topic/subtopic combinations, and to have that easily validated. I'm not sure what the approach should be.

I first thought that I could define complex types in the schema to outline the possible combinations, but the first few goes at that have yielded tautologies that I've not been able to totally get my head around.

The second thought I had was as I've indicated above; put the topics/subtopics in a separate document, perhaps giving each subtopic a unique 'id' attribute. I could then instead use something like:

<mydoc subtopic_id="st4"> ... </mydoc>

I could then perhaps validate that mydoc only contains subtopic_ids that exist in the topics document. But I've been scratching my head to understand how to validate that. It also means that I've had to create an id key that authors need to remember.

So, what's the canonical approach?

Ideally, I'd like to have someone use an XML editor, like oxygenXML, to author a mydoc (generated from the schema), and have the editor help them enter only valid topic/subtopic combinations.

Is this even possible?

I've been scratching my head on this for a while, and would very much appreciate some words of wisdom if you have any.


Solution

As Michael Kay says, your use case is not one XSD was designed to support.

Two alternatives to doing nothing (or to validating using a program in a Turing-complete language) are

  • Schematron, which allows you to make assertions about documents using the full power of XPath (or another query language; details vary by implementation). It can be regarded as a simple way to write the custom XSLT stylesheet MK suggests

  • SML (Service Modeling Language), a little-known W3C spec that can be thought of as providing extensions to XSD to support cross-document validation
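To make the Schematron option concrete, here is a minimal sketch of a rule that checks the topic/subtopic pair in mydoc against the topics document from the question. It assumes the topics document is saved as topics.xml (a hypothetical file name) at a location the processor can resolve, and that the implementation uses the XSLT query binding, so that document() is available. The whitespace around the names in the sample is normalised before comparing.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            queryBinding="xslt">

  <!-- Assumption: topics.xml is a hypothetical file name; adjust the
       path to wherever the topic hierarchy actually lives. -->
  <sch:let name="topics" value="document('topics.xml')/topics"/>

  <sch:pattern>
    <sch:rule context="mydoc">
      <!-- Trim the values so " Subtopic 1 " matches "Subtopic 1". -->
      <sch:let name="t" value="normalize-space(topic)"/>
      <sch:let name="s" value="normalize-space(subtopic)"/>

      <!-- Passes only if the named topic contains the named subtopic. -->
      <sch:assert test="$topics/topic[normalize-space(name) = $t]
                        /subtopic[normalize-space(name) = $s]">
        "<sch:value-of select="$s"/>" is not a subtopic of
        "<sch:value-of select="$t"/>" in topics.xml.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

With the sample data, a mydoc containing First Topic and Subtopic 1 would pass, while First Topic paired with Subtopic 3 would trigger the assertion message. How a relative path passed to document() is resolved can differ between implementations, so treat the file reference as something to verify in your own setup.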

OTHER TIPS

XML Schema isn't designed to validate cross-document relationships. One approach would be to build a single super-document and apply XSD validation to that; but all approaches that rely on preprocessing before validation suffer from the difficulty of producing good diagnostics when the document is invalid. So I would recommend doing the validation with a custom XSLT stylesheet.
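As a rough illustration of the stylesheet route, the XSLT 1.0 sketch below is run against mydoc and reports, as plain text, whether the topic is unknown or the subtopic does not belong to it. The file name topics.xml and the message wording are assumptions made for the example, not part of the original documents.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumption: topics.xml (hypothetical name) sits next to this
       stylesheet; a relative URI here resolves against the stylesheet. -->
  <xsl:variable name="topics" select="document('topics.xml')/topics"/>

  <xsl:template match="/mydoc">
    <!-- Trim the values so stray whitespace does not cause mismatches. -->
    <xsl:variable name="t" select="normalize-space(topic)"/>
    <xsl:variable name="s" select="normalize-space(subtopic)"/>
    <xsl:variable name="known"
                  select="$topics/topic[normalize-space(name) = $t]"/>
    <xsl:choose>
      <!-- Case 1: the topic itself is not in the hierarchy. -->
      <xsl:when test="not($known)">
        <xsl:text>ERROR: unknown topic '</xsl:text>
        <xsl:value-of select="$t"/>
        <xsl:text>'&#10;</xsl:text>
      </xsl:when>
      <!-- Case 2: the topic exists but does not contain the subtopic. -->
      <xsl:when test="not($known/subtopic[normalize-space(name) = $s])">
        <xsl:text>ERROR: '</xsl:text>
        <xsl:value-of select="$s"/>
        <xsl:text>' is not a subtopic of '</xsl:text>
        <xsl:value-of select="$t"/>
        <xsl:text>'&#10;</xsl:text>
      </xsl:when>
      <xsl:otherwise>
        <xsl:text>OK&#10;</xsl:text>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Because the check runs directly on the author's document rather than on a merged super-document, the messages can name the offending values in the author's own terms, which is the diagnostic advantage mentioned above.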

Integrating such validation into an authoring tool looks, shall we say, challenging.

Licensed under: CC-BY-SA with attribution