Question

I would like my XML instance documents to consist of one or more <a/> elements followed by an equal number of <b/> elements. Here are some valid instances:

<a/><b/>, <a/><a/><b/><b/>, <a/><a/><a/><b/><b/><b/>

I want to use XML Schema 1.0 to implement it.

I tried this approach:

<xs:group name="context-free-language">
    <xs:sequence>
        <xs:element name="a" fixed="a" />
        <xs:group ref="context-free-language" minOccurs="0" />
        <xs:element name="b" fixed="b" />
    </xs:sequence>
</xs:group>

Unfortunately, circular group references are not allowed.

Any suggestions on how to implement this?

Solution 4

XML Schema 1.0 provides only partial support for context-free grammars. Regrettably, it does not support grammars that require an equal number of a's and b's. On the other hand, it does support some context-free grammars. I wrote an article which explains this: http://www.xfront.com/XML-Schema-1-0-and-Relax-NG-Partially-Support-Context-Free-Grammars.pdf

OTHER TIPS

Your requirement is to recognize a context-free language. You cannot do that with a content model in a schema language that requires content models to be regular expressions, and which therefore defines regular, not context-free, languages. No schema language currently in wide use allows content models to define context-free languages.
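
To illustrate the limitation, the closest an XSD 1.0 content model can get is a regular superset of the language: one or more <a/> followed by one or more <b/>, with nothing tying the two counts together. A minimal sketch, assuming a wrapper element named root (an assumption; the question does not name the parent element):

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- "root" is a hypothetical wrapper element; the question does not name one -->
  <xs:element name="root">
    <xs:complexType>
      <xs:sequence>
        <!-- Regular language a+ b+: order is enforced, equal counts are not -->
        <!-- No type is given, so the content of a and b is unrestricted (xs:anyType) -->
        <xs:element name="a" maxOccurs="unbounded"/>
        <xs:element name="b" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

This accepts <a/><a/><b/><b/> but also <a/><b/><b/>, so the count constraint still has to be checked by some other means.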

Your options are (1) change your design to work better with the technology at your disposal instead of working against it, (2) use ad-hoc processes (like the XSLT stylesheet suggested by InfantPro'Aravind'), or (3) use assertions in Schematron or in XSD 1.1 to enforce the constraint.
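
As a rough sketch of option (3) with Schematron, the rule below checks the count and the ordering with XPath 1.0 expressions. The wrapper element name root and the message texts are assumptions made for illustration, not something taken from the question:

```xml
<?xml version="1.0" encoding="utf-8"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:pattern>
    <!-- "root" is a hypothetical parent element holding the a and b children -->
    <sch:rule context="root">
      <!-- At least one a must be present -->
      <sch:assert test="a">Expected at least one a element.</sch:assert>
      <!-- No a may appear after any b, so all a elements come first -->
      <sch:assert test="not(b/following-sibling::a)">All a elements must precede all b elements.</sch:assert>
      <!-- The context-free part: the two counts must match -->
      <sch:assert test="count(a) = count(b)">The number of a and b elements must be equal.</sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

The ordering check could equally be left to an ordinary XSD content model, with Schematron handling only the count.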

This is not possible with XSD 1.0. I think not possible with 1.1 as well.

There are alternative methods, such as Schematron. Alternatively, one can use XSLT to transform the input and output a result, which in turn can be validated to determine whether the XML is valid.

Here is a brief outline of this second method:

Sample Input XML:

<?xml version="1.0" encoding="utf-8"?>
<root>
  <a/>
  <a/>
  <b/>
  <b/>
</root>

Sample XSLT:

<?xml version="1.0" encoding="utf-8"?>

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/root">
    <validation>
      <xsl:choose>
        <!--Output 'true' if count is equal .. and 'false' otherwise-->
        <xsl:when test="count(a)=count(b)">
          <xsl:text>true</xsl:text>
        </xsl:when>
        <xsl:otherwise>
          <xsl:text>false</xsl:text>
        </xsl:otherwise>
      </xsl:choose>
    </validation>
  </xsl:template>
</xsl:stylesheet> 

Since count(a) equals count(b), this outputs:

<?xml version="1.0" encoding="utf-8"?>
<validation>true</validation>

That output is in turn validated against:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="validation" type="xs:boolean" fixed="true"/>
</xs:schema>

This passes here because the <validation> element has the value true.

Note: the XSLT only creates a transformed copy, which I use for extended validation; it does not modify the original input.
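
For contrast, here is a sketch of a failing case, using a made-up input (not from the original answer) with one more <a/> than <b/>:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical invalid instance: two a elements, only one b -->
<root>
  <a/>
  <a/>
  <b/>
</root>
```

Here count(a) is 2 and count(b) is 1, so the stylesheet writes false into <validation>, and the schema rejects that result because of fixed="true". Note that the count test alone says nothing about ordering: an input such as <b/><a/> has equal counts too, so the order of the elements would have to be checked separately, for example by an ordinary XSD content model.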

InfantPro'Aravind' states incorrectly:

This is not possible with XSD 1.0. I think not possible with 1.1 as well.

In fact it's quite possible using XSD 1.1 assertions. Just define a content model that allows any number of <a/> elements followed by any number of <b/> elements, and then add the assertion

<xs:assert test="count(a) = count(b)"/>
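
Put together, a complete schema might look roughly like the sketch below. It uses the lowercase element names from the question, and the wrapper element root is an assumption borrowed from the sample input in the other answer. It requires an XSD 1.1 processor, since xs:assert is not part of XSD 1.0:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- "root" is an assumed wrapper element name -->
  <xs:element name="root">
    <xs:complexType>
      <xs:sequence>
        <!-- Regular part: one or more a, then one or more b -->
        <xs:element name="a" maxOccurs="unbounded"/>
        <xs:element name="b" maxOccurs="unbounded"/>
      </xs:sequence>
      <!-- XSD 1.1 assertion: evaluated against the root element's children -->
      <xs:assert test="count(a) = count(b)"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The content model handles the ordering and the "one or more" requirement, so the assertion only has to compare the two counts.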
Licensed under: CC-BY-SA with attribution