Question

EDIT:

This is the smallest test case in lxml that I can come up with (written entirely in Python):

from lxml import etree

# Source document. Passed as bytes because it carries an encoding declaration.
doc = etree.XML(b'''\
<?xml version="1.0" encoding="UTF-8"?>
<MD_Metadata xmlns="http://www.isotc211.org/2005/gmd" xmlns:gco="http://www.isotc211.org/2005/gco">
  <language/>
  <characterSet/>
</MD_Metadata>''')

# Stylesheet
xslt_tree = etree.XML(b'''\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:gmd="http://www.isotc211.org/2005/gmd">

  <!-- Match the root element and copy whatever select="*" picks up -->
  <xsl:template match="/gmd:MD_Metadata">
    <xsl:copy-of select="*"/>
    <xsl:message>
    Worked
    </xsl:message>
  </xsl:template>
</xsl:stylesheet>''')

transform = etree.XSLT(xslt_tree)

result = transform(doc)
print(transform.error_log)
print(etree.tostring(result, pretty_print=True).decode())

This outputs:

<language xmlns="http://www.isotc211.org/2005/gmd" xmlns:gco="http://www.isotc211.org/2005/gco"/>

when surely it should output

<MD_Metadata xmlns="http://www.isotc211.org/2005/gmd" xmlns:gco="http://www.isotc211.org/2005/gco">
  <language/>
  <characterSet/>
  </MD_Metadata>

Any ideas why?


OLD QUESTION

I have an xml file like this:

<?xml version="1.0" encoding="UTF-8"?>
<MD_Metadata xmlns="http://www.isotc211.org/2005/gmd">
  <language>
    <LanguageCode codeList="http://www.loc.gov/standards/iso639-2/php/code_list.php" codeListValue="eng" codeSpace="ISO639-2">eng</LanguageCode>
  </language>
  <characterSet>
    <MD_CharacterSetCode codeList="http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#MD_CharacterSetCode" codeListValue="utf8" codeSpace="ISOTC211/19115">utf8</MD_CharacterSetCode>
  </characterSet>
  .... etc
</MD_Metadata>

and an XSLT file as follows:

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Show all elements -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- This adds the contact tag if it doesn't exist -->
  <xsl:template match="/gmd:MD_Metadata">
    <xsl:copy-of select="@*|node()">
      <xsl:if test="not(/gmd:MD_Metadata/gmd:contact)">
        <xsl:element name="contact" namespace="http://www.isotc211.org/2005/gmd">
        </xsl:element>
      </xsl:if>
    </xsl:copy-of>
  </xsl:template>

</xsl:stylesheet>

When I run it with lxml in Python, only the MD_Metadata element and its first child are returned. However, when I run it in Eclipse WTP (Eclipse XSL Tools) using either the default Java processor or Xalan, I get all the child elements of MD_Metadata, including characterSet and the elements after it. The latter is what I expected, given the xsl:copy-of tag. I can't see anything wrong with how I call the transform in Python, but just in case:

from lxml import etree

xslt_root = etree.parse("XSLFile")
transform = etree.XSLT(xslt_root)
result_tree = transform(doc)  # doc is the parsed XML file shown above
print(etree.tostring(result_tree, pretty_print=True))

Is there a substantial difference between the two processors I am using or is there another explanation?


Solution

The reason you are getting odd behaviour is that xsl:copy-of must be an empty element: in XSLT 1.0 it takes only a select attribute and cannot have child content. I can only presume that some engines are "helpfully" trying to interpret the nested xsl:if in some undefined way, and that is what causes the difference.

Remove the elements nested inside xsl:copy-of, which cause the undefined behaviour, and the output should be consistent across the different engines again.
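For what it's worth, here is a minimal sketch of one way the stylesheet could be restructured so that nothing is nested inside xsl:copy-of. It is not the asker's final code: it assumes the gmd prefix is declared on the stylesheet, swaps xsl:copy-of for xsl:copy plus xsl:apply-templates, and uses a cut-down two-child document as made-up test input.

```python
from lxml import etree

GMD = "http://www.isotc211.org/2005/gmd"

# Assumed rewrite of the stylesheet: xsl:copy wraps the children and the
# conditional gmd:contact element, so no instruction sits inside xsl:copy-of.
stylesheet = etree.XML(b'''\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:gmd="http://www.isotc211.org/2005/gmd">

  <!-- Identity template: copy every node and attribute unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Copy the root element and append gmd:contact only if it is missing -->
  <xsl:template match="/gmd:MD_Metadata">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
      <xsl:if test="not(gmd:contact)">
        <xsl:element name="contact" namespace="http://www.isotc211.org/2005/gmd"/>
      </xsl:if>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>''')

# Hypothetical, cut-down input document without a contact element
source = etree.XML(b'''\
<MD_Metadata xmlns="http://www.isotc211.org/2005/gmd">
  <language/>
  <characterSet/>
</MD_Metadata>''')

transform = etree.XSLT(stylesheet)
result = transform(source)

# Local names of the element children of the transformed root
child_names = [etree.QName(el).localname for el in result.getroot()]
print(child_names)
```

The more specific template wins over the identity template for the root element, so the root is copied once with its existing children followed by the new contact element.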

OTHER TIPS

Ah, it was the XPath.

I think I should have used <xsl:copy-of select="self::*"/>. I thought * selected the current node as well as its children, but it is short for child::* and selects only the child elements.
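A quick way to see the difference outside of XSLT is to evaluate both expressions with lxml's xpath() method, using the root element as the context node. This is only a sketch on a cut-down document; the local_names helper is made up for the illustration.

```python
from lxml import etree

# Cut-down version of the document from the question
doc = etree.XML(
    b'<MD_Metadata xmlns="http://www.isotc211.org/2005/gmd">'
    b"<language/><characterSet/>"
    b"</MD_Metadata>"
)


def local_names(nodes):
    """Hypothetical helper: strip the namespace from each element's tag."""
    return [etree.QName(node).localname for node in nodes]


# "*" is evaluated on the child axis: only the children of the context node
children = local_names(doc.xpath("*"))

# "self::*" selects the context node itself
context = local_names(doc.xpath("self::*"))

print(children)
print(context)
```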

Thanks for the help.

Licensed under: CC-BY-SA with attribution