Question

Is it possible to have the XSLT "select" and "match" attributes (and probably others) validated against the XSD schema of the input data?

For example, if my XSD schema defines an input XML root element named "realRoot":

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
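<xs:schema version="1.0" xmlns:xs="http://www.w3.org/2001/XMLSchema"
  targetNamespace="http://example.org/my-schema"
  xmlns="http://example.org/my-schema"
  elementFormDefault="qualified">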

  <xs:element name="realRoot" type="rootType"/>

  <xs:complexType name="rootType">
    <xs:sequence>
      ...
    </xs:sequence>
  </xs:complexType>
</xs:schema>

then this XSLT stylesheet

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:my="http://example.org/my-schema"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://example.org/my-schema my-schema.xsd">

  <xsl:template match="/my:fakeRoot">
    ...
  </xsl:template>

</xsl:stylesheet>

would fail fast, because no fakeRoot element is defined in the schema.

This could help catch mistakes sooner, make XML schemas easier to refactor, and enable IDE auto-completion for these XSLT attributes.


Solution 2

Saxon-EE, with schema-awareness switched on, will check your select expressions and match patterns against the schema. However, it requires a bit more tweaking than you suggest.

Writing <xsl:template match="/my:fakeRoot"> won't be rejected, even though there is no fakeRoot element in your schema, because it's perfectly legitimate for a stylesheet to create elements that aren't valid against the schema and then process them (perhaps to make them valid). It will be rejected, however, if you write it as <xsl:template match="/schema-element(my:fakeRoot)">, because then you're writing a pattern that will only match elements defined in the schema. Similarly, an expression like $x//svg:polygno (with a misspelled element name) will be rejected only if the type of $x is declared, e.g. using as="schema-element(svg)".
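To make the two cases above concrete, here is a minimal sketch of a schema-aware XSLT 2.0 stylesheet. It assumes the schema from the question is stored as my-schema.xsd and has the target namespace http://example.org/my-schema; the named template "process-root" and its parameter are hypothetical, added only for illustration. It also assumes the processor is run in schema-aware mode and that the source document has been validated, since schema-element() only matches nodes carrying a matching type annotation.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:my="http://example.org/my-schema">

  <!-- Make the element declarations of the schema known to the stylesheet. -->
  <xsl:import-schema namespace="http://example.org/my-schema"
                     schema-location="my-schema.xsd"/>

  <!-- realRoot is declared in the schema, so this pattern is accepted.
       Writing schema-element(my:fakeRoot) here instead would be a static
       error, because the schema contains no such element declaration. -->
  <xsl:template match="/schema-element(my:realRoot)">
    <xsl:call-template name="process-root">
      <xsl:with-param name="x" select="."/>
    </xsl:call-template>
  </xsl:template>

  <!-- Hypothetical helper: because $x has a declared schema type, a
       schema-aware processor can check path steps starting at $x against
       the content model of realRoot. -->
  <xsl:template name="process-root">
    <xsl:param name="x" as="schema-element(my:realRoot)"/>
    <xsl:copy-of select="$x/*"/>
  </xsl:template>

</xsl:stylesheet>
```

The up-front cost mentioned in this answer shows up here as the xsl:import-schema declaration and the explicit as="..." type on the parameter.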

There's a new option in XSLT 3.0 (new draft just out; not yet implemented in Saxon) which will cause <xsl:template match="/my:fakeRoot"> to have the same meaning as <xsl:template match="/schema-element(my:fakeRoot)">, and therefore to be rejected if there's no element of this name in the schema. And of course it will also work on more complex patterns; for example, match="table/td" will be rejected if td cannot appear as a child of table.
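As a sketch of how this might look: in the XSLT 3.0 specification as later published, an option of this kind is the typed attribute of the xsl:mode declaration. The example below assumes that form, and again assumes the schema from the question is stored as my-schema.xsd with the target namespace http://example.org/my-schema. Whether a given processor version supports it should be checked against its documentation.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:my="http://example.org/my-schema">

  <xsl:import-schema namespace="http://example.org/my-schema"
                     schema-location="my-schema.xsd"/>

  <!-- Declare that the unnamed (default) mode processes typed nodes and
       that element names in its match patterns are checked strictly
       against the imported schema. -->
  <xsl:mode typed="strict"/>

  <!-- With typed="strict", this pattern is read as
       /schema-element(my:realRoot). Changing the name to my:fakeRoot
       should then be reported as an error, because the schema has no
       such element declaration. -->
  <xsl:template match="/my:realRoot">
    <xsl:copy-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

The appeal of this design is that existing match patterns do not need to be rewritten with schema-element(); the strictness is switched on once per mode.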

My experience of using schema-aware stylesheet development is that it can make debugging an awful lot easier, especially if you are working with a very complex vocabulary. However, there is an up-front cost in declaring all your types, and this means many people are not getting the full benefits. Hopefully the new options will make it more accessible.

OTHER TIPS

Yes, it's possible in principle. In practice, I don't know any XSLT processors that perform such analysis, and when I have heard research papers given on this kind of thing, the message I have always taken away is "Wow, that gets complicated fast!"

Some complicating factors:

  • XSD doesn't provide an unambiguous way to identify specific top-level elements as potential document root elements, so the premise for your imaginary story is already a bit iffy.

  • XSLT is designed to work with any well-formed XML (or, strictly speaking, any instantiation of the XPath data model that the XSLT processor can read), so a stylesheet could assume validity in the input only in a special mode of operation, not defined in the spec.

  • The template-driven flow of control in XSLT makes it very difficult to generate tight descriptions of the possible input nodes which will be current when particular templates are evaluated; the result is a very large space of possible states, which makes it very hard to generate good guarantees for patterns the processor could exploit.

All that said, it would certainly be possible for a processor to look at match patterns and select expressions in a stylesheet and say, for each one, whether it could in principle match any nodes in an input document valid against a given schema. That would be a lot more tractable, could still be useful, and might make a good student project.

Licensed under: CC-BY-SA with attribution