Question

I'm trying to split the comma-separated tag list below into individual elements. The element and attribute names in the XML source will always be the same. I'm using XSLT 1.0, so I was hoping for a 1.0 solution. Based on a similar example, I thought the following XSL could work:

<xsl:stylesheet  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output indent="yes"/>
      <xsl:template match="pbcoreCollection">
        <pbcoreCollection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
             <pbcoreDescriptionDocument>
                 <xsl:call-template name="tokenize">
                 <xsl:with-param name="text" select="instantiationannotation"/>
                 <xsl:with-param name="elemName" select="'instantitionAnnotation'"/>
                </xsl:call-template>
            </pbcoreDescriptionDocument>
        </pbcoreCollection>
    </xsl:template> 
    <xsl:template name="tokenize">
        <xsl:param name="text"/>
        <xsl:param name="elemName"/>
        <xsl:param name="sep" select="', '"/>
        <xsl:choose>
            <xsl:when test="contains($text, $sep)">
                <xsl:element name="{$elemName}">
                    <xsl:value-of select="substring-before($text, $sep)"/>
                </xsl:element>
                <!-- recursive call -->
                <xsl:call-template name="tokenize">
                    <xsl:with-param name="text" select="substring-after($text, $sep)" />
                    <xsl:with-param name="elemName" select="$elemName" />
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <xsl:element name="{$elemName}">
                    <xsl:value-of select="$text"/>
                </xsl:element>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>

But it yields this result:

<?xml version="1.0"?>
<pbcoreCollection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <pbcoreDescriptionDocument>
    <instantitionAnnotation/>
  </pbcoreDescriptionDocument>
</pbcoreCollection>

My original XML:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<pbcoreCollection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">  
   <pbcoreDescriptionDocument>
        <pbcoreInstantiation>
            <instantiationAnnotation annotationType="CMS tag">congress, guns, gun_policy, catholic_schools, social_media, </instantiationAnnotation>
        </pbcoreInstantiation>
   </pbcoreDescriptionDocument>
</pbcoreCollection>

I would like it to look like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<pbcoreCollection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">  
   <pbcoreDescriptionDocument>
        <pbcoreInstantiation>
            <instantiationAnnotation annotationType="CMS tag">congress</instantiationAnnotation>
            <instantiationAnnotation annotationType="CMS tag">guns</instantiationAnnotation>
            <instantiationAnnotation annotationType="CMS tag">gun_policy</instantiationAnnotation>
            <instantiationAnnotation annotationType="CMS tag">catholic_schools</instantiationAnnotation>
            <instantiationAnnotation annotationType="CMS tag">social_media</instantiationAnnotation>
        </pbcoreInstantiation>
     </pbcoreDescriptionDocument>
</pbcoreCollection>

Solution

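Your attempt returns an empty element because the select expression instantiationannotation is evaluated relative to pbcoreCollection, and no child of that name exists: XML names are case-sensitive, and the actual instantiationAnnotation element sits two levels deeper, inside pbcoreDescriptionDocument/pbcoreInstantiation. The elemName value 'instantitionAnnotation' is also misspelled.

Since the element and attribute names in the XML source will always be the same, you could simplify to: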

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:template match="/">
    <pbcoreCollection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">  
       <pbcoreDescriptionDocument>
            <pbcoreInstantiation>
                <xsl:call-template name="tokenize">
                    <xsl:with-param name="text" select="pbcoreCollection/pbcoreDescriptionDocument/pbcoreInstantiation/instantiationAnnotation"/>
                </xsl:call-template>
            </pbcoreInstantiation>
       </pbcoreDescriptionDocument>
    </pbcoreCollection>
</xsl:template>

<xsl:template name="tokenize">
    <xsl:param name="text"/>
    <xsl:param name="sep" select="', '"/>
    <xsl:choose>
        <xsl:when test="contains($text, $sep)">
            <instantiationAnnotation annotationType="CMS tag">
                <xsl:value-of select="substring-before($text, $sep)"/>
            </instantiationAnnotation>
            <!-- recursive call -->
            <xsl:call-template name="tokenize">
                <xsl:with-param name="text" select="substring-after($text, $sep)" />
            </xsl:call-template>
        </xsl:when>
        <xsl:otherwise>
            <instantiationAnnotation annotationType="CMS tag">
                <xsl:value-of select="$text"/>
            </instantiationAnnotation>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

</xsl:stylesheet>

Note: this may need a bit more work if your input really carries a trailing ", " separator as shown in your example.
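
If the trailing separator is really there, one possible way to handle it is sketched below. This is only a sketch under a few assumptions that go beyond the question: it splits on a bare "," instead of ", ", trims each token with normalize-space(), and skips any token that ends up empty. The variable name token is made up for illustration.

```xml
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:template match="/">
    <pbcoreCollection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
       <pbcoreDescriptionDocument>
            <pbcoreInstantiation>
                <xsl:call-template name="tokenize">
                    <xsl:with-param name="text" select="pbcoreCollection/pbcoreDescriptionDocument/pbcoreInstantiation/instantiationAnnotation"/>
                </xsl:call-template>
            </pbcoreInstantiation>
       </pbcoreDescriptionDocument>
    </pbcoreCollection>
</xsl:template>

<xsl:template name="tokenize">
    <xsl:param name="text"/>
    <!-- assumption: split on the comma alone and trim whitespace afterwards -->
    <xsl:param name="sep" select="','"/>
    <!-- current token: the text before the first separator, or all of it if none is left -->
    <xsl:variable name="token">
        <xsl:choose>
            <xsl:when test="contains($text, $sep)">
                <xsl:value-of select="normalize-space(substring-before($text, $sep))"/>
            </xsl:when>
            <xsl:otherwise>
                <xsl:value-of select="normalize-space($text)"/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:variable>
    <!-- string() is needed: in XSLT 1.0 a result tree fragment by itself always tests true -->
    <xsl:if test="string($token)">
        <instantiationAnnotation annotationType="CMS tag">
            <xsl:value-of select="$token"/>
        </instantiationAnnotation>
    </xsl:if>
    <!-- recursive call on whatever follows the separator -->
    <xsl:if test="contains($text, $sep)">
        <xsl:call-template name="tokenize">
            <xsl:with-param name="text" select="substring-after($text, $sep)"/>
            <xsl:with-param name="sep" select="$sep"/>
        </xsl:call-template>
    </xsl:if>
</xsl:template>

</xsl:stylesheet>
```

With the sample input from the question, the text left after the last comma is a single space, which normalize-space() reduces to an empty string, so no empty instantiationAnnotation should be written at the end. The same check also drops empty tokens caused by doubled commas, which may or may not be what you want.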

Licensed under: CC-BY-SA with attribution