Question

I use XSLT 1.0 to manipulate an XHTML file, and I wanted to start from an identity copy. To my surprise, the transformation adds attributes that were absent from the original file. Please explain this phenomenon. I would rather avoid it, so that the source and result files are easier to compare.

I tried both xsltproc and msxsl, with no difference: rowspan and colspan are added to every td element.

Input:

<?xml version="1.0" encoding="windows-1250" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1250" />
<title>Anything</title>
</head>

<body>
<table>
<tr><td class="skl" >test</td><td class="kwota" >1 800,00</td></tr>
</table>
</body>                    

</html>

xslt:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0"
  >
  <xsl:output method="xml"
    omit-xml-declaration="no"
    encoding="windows-1250"
    doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
    doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
  />

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates
        select="node()|@*|processing-instruction()|comment()" />
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

The only difference in the output is this line:

<tr><td class="skl" rowspan="1" colspan="1">test</td><td class="kwota" rowspan="1" colspan="1">1 800,00</td></tr>

Validating the source file against the DTD shows no errors. I can insert these attributes into the source file to work around the problem, but I'm curious about the cause of this mess.

Edit: I use the original DTD, downloaded (with a 20-second delay) from
http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd

<!ATTLIST td
  %attrs;
  abbr        %Text;         #IMPLIED
  axis        CDATA          #IMPLIED
  headers     IDREFS         #IMPLIED
  scope       %Scope;        #IMPLIED
  rowspan     %Number;       "1"
  colspan     %Number;       "1"
  %cellhalign;
  %cellvalign;
  >

Solution

Your XSLT processors are behaving perfectly correctly. No new attributes are being added. The rowspan and colspan attributes were always in your input document, supplied by the DTD it references. Whether the value "1" is serialized as an explicit attribute or implied by your doctype declaration makes no difference to the data model.

The ATTLIST above shows that rowspan and colspan each have a default value of "1". A parser that reads the DTD supplies these attributes, so there is no way for a td element to lack them and still conform to XHTML 1.0 Strict. The other attributes, marked #IMPLIED, are optional and have no default.

I hope that explains it.
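
The effect can be reproduced without any XSLT, which suggests that the defaults come from the parser rather than from the stylesheet. The following is a minimal sketch, assuming Python's standard-library expat bindings and a cut-down ATTLIST placed in an internal DTD subset. The inline declaration and the helper name td_attributes are illustrative assumptions; the real XHTML DTD is external and much larger.

```python
import xml.parsers.expat

# Cut-down stand-in for the XHTML DTD: only the two defaulted td attributes.
# (Assumption for illustration; the real declaration lives in xhtml1-strict.dtd.)
DOC = """<!DOCTYPE table [
  <!ATTLIST td rowspan CDATA "1"
               colspan CDATA "1">
]>
<table><tr><td class="skl">test</td></tr></table>"""


def td_attributes(document):
    """Return the attributes the parser reports for the first td element."""
    seen = []

    def start(name, attrs):
        if name == "td":
            seen.append(dict(attrs))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.Parse(document, True)
    return seen[0]


if __name__ == "__main__":
    # The source text only writes class="skl", yet the parser also reports
    # rowspan and colspan with the defaults declared in the DTD.
    print(td_attributes(DOC))
```

An identity transform copies whatever the parser hands it, so attributes defaulted this way end up serialized explicitly in the result.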

OTHER TIPS

Here are several ways of disabling this "feature" in the processors I was able to test:

libxml

xsltproc: --nodtdattr

libxslt / libxml: don't specify XML_PARSE_DTDATTR when loading the source, for example in xmlReadFile

msxml

msxsl: -xe - don't resolve externals

Msxml.DomDocument: set doc.resolveExternals = False and doc.validateOnParse = False before load; this also disables the whole DTD

resolveExternals:

In MSXML 3.0 and MSXML 6.0 the default resolveExternals value is True. In MSXML 6.0, the default setting is False.

Yes, that sentence contradicts itself, but I copied it verbatim from the Microsoft documentation. I guess it should say that the default is True in 3.0 and 4.0 and False in 6.0.

The PopulateElementDefaultValues property, introduced in MSXML 6.0 SP1, has a promising description, but it does not work for me with DTDs.
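
As an analogy only (this is expat, not libxml or MSXML), Python's standard-library expat bindings expose a comparable per-parser switch, specified_attributes. The sketch below assumes a cut-down inline ATTLIST and a hypothetical helper name, reported_td_attributes. It shows that turning the switch on drops the DTD-defaulted rowspan while keeping the colspan that the document itself writes.

```python
import xml.parsers.expat

# Inline stand-in for the XHTML DTD (assumption for illustration only).
DOC = """<!DOCTYPE table [
  <!ATTLIST td rowspan CDATA "1"
               colspan CDATA "1">
]>
<table><tr><td class="kwota" colspan="2">1 800,00</td></tr></table>"""


def reported_td_attributes(document, only_specified):
    """Parse the document and return the attributes reported for the first td."""
    seen = []

    def start(name, attrs):
        if name == "td":
            seen.append(dict(attrs))

    parser = xml.parsers.expat.ParserCreate()
    # True: report only attributes written in the document itself.
    # False (expat's default): also report values defaulted from the DTD.
    parser.specified_attributes = only_specified
    parser.StartElementHandler = start
    parser.Parse(document, True)
    return seen[0]


if __name__ == "__main__":
    print(reported_td_attributes(DOC, only_specified=False))
    print(reported_td_attributes(DOC, only_specified=True))
```

Unlike the MSXML settings above, this switch leaves the rest of the DTD processing in place and only changes which attributes are reported.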

Licensed under: CC-BY-SA with attribution