Question

I have been brushing up on my XML Schema skills over the past few days, and I spent a whole day trying to understand the intricacies of namespaces with respect to schemas. What struck me most was the seeming uselessness of the form="qualified|unqualified" attribute on non-global <element> and <attribute> elements.

My question is: does the form attribute actually add expressivity to XML schemas / XML documents, or does it just make the notation of certain XML documents easier/different?

I understand that XML documents that need to conform to a certain schema are generally easier to write when all elements are to be qualified with a namespace (one xmlns="xyz" attribute on the document element is all you need), but is that all? Why would anyone bother with unqualified non-global elements at all?

Solution

First, on the formal question: does the form attribute actually add to the expressivity of XSD? I think so: with the form attribute I can write types whose sets of valid instances cannot (as far as I can see) be matched by any type written without using the form attribute.

For example (in a schema which defines a top-level element named a of type duration, so that the qualified local a cannot simply be replaced by a reference to the global a):

<choice maxOccurs="unbounded">
  <element form="qualified" name="a" type="integer"/> 
  <element form="unqualified" name="a" type="gYear"/> 
</choice>
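
To make this concrete, here is a sketch of an instance that the choice above should accept. It assumes, purely for illustration, that the schema's target namespace is http://example.com/ns and that the choice is the content model of a global element named p; neither name comes from the original schema.

```xml
<!-- Hypothetical instance: the namespace URI and the wrapper element p are assumptions. -->
<n:p xmlns:n="http://example.com/ns">
  <n:a>42</n:a>   <!-- qualified local a: must be an integer -->
  <a>1999</a>     <!-- unqualified local a: must be a gYear -->
  <n:a>-7</n:a>   <!-- the choice repeats, so both forms may be mixed freely -->
</n:p>
```

The two children share the local name a but have different expanded names, which is why they can carry different types inside the same content model.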

Then, on the less formal question: why would anyone bother?

The short answer is: because for each possible way of choosing among qualified and unqualified names for local elements and attributes, some people think that that is the right way to declare local elements and/or attributes.

A longer answer will take a little time. Sit down, get yourself a cup of coffee.

There are two schools of thought on local elements; both were represented in the working group that designed XSD.

One school of thought believes that if element P is in namespace N, and element C is local to the type of element N:P, then it is only natural that the child element should be named N:C. It is, after all, part of the same vocabulary as P, and identifying the vocabulary is what namespaces are all about. From your final question, I guess you lean toward this way of seeing things.

The other school of thought reasons that local elements are like local attributes. An attribute local to (the type of) element N:P is named A, not N:A -- the name N:A denotes, by definition, an attribute global to namespace N, not local to element N:P. By analogy, local children should also use unqualified names, so that attributes and child elements are treated in similar ways.
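
The difference between the two schools can be sketched with one small schema. The namespace URI http://example.com/N and the string types below are placeholders chosen for illustration; only the names P, C and A come from the discussion above.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/N"
           elementFormDefault="qualified">
  <!-- P is global, so it is always in the target namespace -->
  <xs:element name="P">
    <xs:complexType>
      <xs:sequence>
        <!-- C is local; its namespace depends on elementFormDefault -->
        <xs:element name="C" type="xs:string"/>
      </xs:sequence>
      <!-- A is local; attributeFormDefault defaults to unqualified -->
      <xs:attribute name="A" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

With elementFormDefault="qualified", as written, the first school's instance is the valid one:

```xml
<N:P xmlns:N="http://example.com/N" A="x">
  <N:C>text</N:C>
</N:P>
```

If elementFormDefault is changed to "unqualified" (which is also the default when the attribute is omitted), the second school's instance is the valid one:

```xml
<N:P xmlns:N="http://example.com/N" A="x">
  <C>text</C>
</N:P>
```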

The presence of the form attribute on XSD element and attribute declarations might suggest the possible existence of a third school of thought, characterized by a desire to treat the choice between qualified and unqualified names as a design choice to be taken individually for each local element or attribute, and not necessarily in a single vocabulary-wide edict. For what it's worth, this third school of thought doesn't actually seem to exist. At least, I don't think I've ever encountered a member. No one ever seems to set out to write complex types of the kind exhibited above with mixtures of qualified and unqualified local names. The essential function of the form attribute is not to allow different local elements to be qualified or unqualified, but to override the default set by the elementFormDefault and attributeFormDefault attributes on the enclosing schema element, thus ensuring that even if schema authors are somehow stuck with the wrong values for those attributes, they can still get the effect they desire.
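
As a sketch of the mechanics only (not a recommended design), the following schema shows form on one declaration taking precedence over the schema-wide default. The namespace URI and the element names P, C and D are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/N"
           elementFormDefault="unqualified">
  <xs:element name="P">
    <xs:complexType>
      <xs:sequence>
        <!-- form takes precedence over elementFormDefault for this declaration -->
        <xs:element name="C" type="xs:string" form="qualified"/>
        <!-- no form attribute: falls back to elementFormDefault -->
        <xs:element name="D" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

An instance valid against this sketch qualifies C but not D:

```xml
<N:P xmlns:N="http://example.com/N">
  <N:C>text</N:C>
  <D>text</D>
</N:P>
```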

I have also never (that I know of) encountered any member of either of the first two schools of thought who could feel any sympathy at all for the reasoning of the other school. That anyone could think the way the members of the other school of thought think has pretty much always come as an unwelcome surprise. With a little effort, smart people of good will find it possible to accept the existence of the other school of thought, and even (with a little more effort) to accept that the members of that school of thought are arguing in good faith, and are not just trying to gum up the works. The variety of views has anthropological interest at best (look at the very odd things people can claim to believe even when otherwise they seem mostly like more or less rational beings! Funny old world, isn't it?). After some months of deadlock most members of the working group were forced to admit to themselves that they just were not going to be able to persuade the other guys to see the error of their ways.

It then became clear that pretty much everyone in the WG had the same ranking for the three possibilities we could think of:

  1. Define things the way I think is right.
  2. Define things so that the schema author must make the choice.
  3. Define things the way the other guys think is right.

Everyone liked the first choice (for suitable values of "I"), but for different WG members "the way I think is right" turned out to denote different things.

No one much liked the second choice, since it makes the schema author's life harder and leads to less consistency in the universe of schemas.

But everyone hated the third choice (being forced to do things the way the other guys wanted) so much that they were willing to accept compromise rather than risk utter defeat.

Licensed under: CC-BY-SA with attribution