Produce context data for first and last occurrences of every value of an element

StackOverflow https://stackoverflow.com/questions/807578

  •  03-07-2019

Question

Given the following xml:

<container>
    <val>2</val>
    <id>1</id>
</container>
<container>
    <val>2</val>
    <id>2</id>
</container>
<container>
    <val>2</val>
    <id>3</id>
</container>
<container>
    <val>4</val>
    <id>1</id>
</container>
<container>
    <val>4</val>
    <id>2</id>
</container>
<container>
    <val>4</val>
    <id>3</id>
</container>

I'd like to return something like

2 - 1
2 - 3
4 - 1
4 - 3

Using a nodeset I've been able to get the last occurrence via:

exsl:node-set($list)/container[not(val = following::val)]

but I can't figure out how to get the first one.

Solution

To get the first and the last occurrence (document order) in each "<val>" group, you can use an <xsl:key> like this:

<xsl:stylesheet 
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
  <xsl:output method="text" />

  <xsl:key name="ContainerGroupByVal" match="container" use="val" />

  <xsl:variable name="ContainerGroupFirstLast" select="//container[
    generate-id() = generate-id(key('ContainerGroupByVal', val)[1])
    or
    generate-id() = generate-id(key('ContainerGroupByVal', val)[last()])
  ]" />

  <xsl:template match="/">
    <xsl:for-each select="$ContainerGroupFirstLast">
      <xsl:value-of select="val" />
      <xsl:text> - </xsl:text>
      <xsl:value-of select="id" />
      <xsl:value-of select="'&#10;'" /><!-- LF -->
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>

EDIT #1: A bit of an explanation since this might not be obvious right away:

  • The <xsl:key> returns all <container> nodes having a given <val>. You use the key() function to query it.
  • The <xsl:variable> is where it all happens. It reads as:
    • for each of the <container> nodes in the document ("//container") check…
    • …if it has the same unique id (generate-id()) as the first node returned by key() or the last node returned by key()
    • where key('ContainerGroupByVal', val) returns the set of <container> nodes matching the current <val>
    • if the unique ids match, include the node in the selection
  • the <xsl:for-each> does the output. It could just as well be a <xsl:apply-templates>.
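As the last point says, the output step could use <xsl:apply-templates> instead. A minimal sketch of that variant follows. The key and the selection are unchanged; the separate container template is the only new piece, and the sketch assumes no other template in the stylesheet matches container:

```xml
<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
  <xsl:output method="text" />

  <xsl:key name="ContainerGroupByVal" match="container" use="val" />

  <xsl:template match="/">
    <!-- Select only the first and the last container of each <val> group. -->
    <xsl:apply-templates select="//container[
      generate-id() = generate-id(key('ContainerGroupByVal', val)[1])
      or
      generate-id() = generate-id(key('ContainerGroupByVal', val)[last()])
    ]" />
  </xsl:template>

  <!-- Output for one selected container: "val - id" followed by a line feed. -->
  <xsl:template match="container">
    <xsl:value-of select="val" />
    <xsl:text> - </xsl:text>
    <xsl:value-of select="id" />
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Moving the output into its own template keeps the selection logic and the formatting apart, which can help if other parts of the stylesheet need to print containers the same way.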

EDIT #2: As Dimitre Novatchev rightfully points out in the comments, you should be wary of using the "//" XPath shorthand. If you can avoid it, by all means, do so — partly because it potentially selects nodes you don't want, and mainly because it is slower than a more specific XPath expression. For example, if your document looks like:

<containers>
  <container><!-- ... --></container>
  <container><!-- ... --></container>
  <container><!-- ... --></container>
</containers>

then you should use "/containers/container" or "/*/container" instead of "//container".


EDIT #3: An alternative syntax of the above would be:

<xsl:stylesheet 
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
  <xsl:output method="text" />

  <xsl:key name="ContainerGroupByVal" match="container" use="val" />

  <xsl:variable name="ContainerGroupFirstLast" select="//container[
    count(
        .
      | key('ContainerGroupByVal', val)[1]
      | key('ContainerGroupByVal', val)[last()]
    ) = 2
  ]" />

  <xsl:template match="/">
    <xsl:for-each select="$ContainerGroupFirstLast">
      <xsl:value-of select="val" />
      <xsl:text> - </xsl:text>
      <xsl:value-of select="id" />
      <xsl:value-of select="'&#10;'" /><!-- LF -->
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>

Explanation: The XPath union operator "|" combines its arguments into a node-set. By definition, a node-set cannot contain duplicate nodes. For example, ". | . | ." creates a node-set containing exactly one node (the current node).

This means that if we create a union node-set from the current node ("."), the "key(…)[1]" node and the "key(…)[last()]" node, then for a group with at least two members its node count will be 2 if (and only if) the current node is one of the other two nodes; in all other cases the count will be 3. In a group with a single <container>, all three are the same node and the count is 1.
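If groups that hold only one <container> should be listed as well, one possible variant is to relax the test from "= 2" to "&lt; 3". In such a group the current node, the first node and the last node are all the same node, so the union holds one node rather than two. The sample input has no such group, so whether it should be reported (once) is an assumption about the wanted output. A minimal sketch:

```xml
<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
  <xsl:output method="text" />

  <xsl:key name="ContainerGroupByVal" match="container" use="val" />

  <!-- A union of 1 or 2 nodes means the current node is the first and/or
       the last node of its group; 3 means it is neither. -->
  <xsl:variable name="ContainerGroupFirstLast" select="//container[
    count(
        .
      | key('ContainerGroupByVal', val)[1]
      | key('ContainerGroupByVal', val)[last()]
    ) &lt; 3
  ]" />

  <xsl:template match="/">
    <xsl:for-each select="$ContainerGroupFirstLast">
      <xsl:value-of select="val" />
      <xsl:text> - </xsl:text>
      <xsl:value-of select="id" />
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

The "&lt;" escape is needed because a literal "<" is not allowed inside an XML attribute value.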

OTHER TIPS

Basic XPath:

//container[position() = 1]  <- the first container among its siblings
//container[position() = last()]  <- the last container among its siblings

Here's a set of XPath functions in more detail.
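Note that these expressions select by position among sibling elements, not per <val>. To pick the first and last occurrence of each value with plain XPath axes, a sketch that mirrors the expression in the question swaps following for preceding. It assumes the containers share a single root element, and since each test scans the rest of the document it will likely be slower than a key on large inputs:

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" />

  <!-- Assumption: all container elements are children of one root element. -->
  <xsl:template match="/*">
    <!-- No earlier val with the same value: first occurrence.
         No later val with the same value: last occurrence. -->
    <xsl:for-each select="container[not(val = preceding::val)
                                 or not(val = following::val)]">
      <xsl:value-of select="concat(val, ' - ', id, '&#10;')" />
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For the sample input from the question this should select the containers with id 1 and id 3 in each group, in document order.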

I. XSLT 1.0

Basically the same solution as the one by Tomalak, but more understandable. It is also complete, so you only need to copy and paste the XML document and the transformation and then press the "Transform" button of your favourite XSLT IDE:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

    <xsl:key name="kContByVal" match="container"
     use="val"/>

    <xsl:template match="/*">
      <xsl:for-each select=
       "container[generate-id()
                 =
                  generate-id(key('kContByVal',val)[1])
                 ]
       ">

       <xsl:variable name="vthisvalGroup"
        select="key('kContByVal', val)"/>

       <xsl:value-of select=
        "concat($vthisvalGroup[1]/val,
              '-',
              $vthisvalGroup[1]/id,
              '&#xA;'
              )
      "/>
       <xsl:value-of select=
        "concat($vthisvalGroup[last()]/val,
              '-',
              $vthisvalGroup[last()]/id,
              '&#xA;'
              )
        "/>
      </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>

when this transformation is applied on the originally-provided XML document (edited to be well-formed):

<t>
    <container>
        <val>2</val>
        <id>1</id>
    </container>
    <container>
        <val>2</val>
        <id>2</id>
    </container>
    <container>
        <val>2</val>
        <id>3</id>
    </container>
    <container>
        <val>4</val>
        <id>1</id>
    </container>
    <container>
        <val>4</val>
        <id>2</id>
    </container>
    <container>
        <val>4</val>
        <id>3</id>
    </container>
</t>

the wanted result is produced:

2-1
2-3
4-1
4-3

Do note:

  1. We use the Muenchian method for grouping to find one container element for each set of such elements that have the same value for val.
  2. From the whole node-list of container elements with the same val value, we output the required data for the first container element in the group and for the last container element in the group.

II. XSLT 2.0

This transformation:

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xsl:output method="text"/>

    <xsl:template match="/*">
      <xsl:for-each-group select="container"
           group-by="val">
        <xsl:for-each select="current-group()[1], current-group()[last()]">
          <xsl:value-of select=
           "concat(val, '-', id, '&#xA;')"/>
        </xsl:for-each>
    </xsl:for-each-group>
    </xsl:template>
</xsl:stylesheet>

when applied on the same XML document as above, produces the wanted result:

2-1
2-3
4-1
4-3

Do note:

  1. The use of the <xsl:for-each-group> XSLT instruction.

  2. The use of the current-group() function.
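One detail worth knowing: "current-group()[1], current-group()[last()]" builds a sequence, and sequences may contain the same item twice, so a group with a single container would be printed twice. If that is not wanted (an assumption, since the sample input has no such group), a possible variant filters the group by position instead:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <xsl:template match="/*">
      <xsl:for-each-group select="container" group-by="val">
        <!-- position() = (1, last()) is true for the first and for the last
             item; in a one-item group that is the same item, selected once. -->
        <xsl:for-each select="current-group()[position() = (1, last())]">
          <xsl:value-of select="concat(val, '-', id, '&#xA;')"/>
        </xsl:for-each>
      </xsl:for-each-group>
    </xsl:template>
</xsl:stylesheet>
```

On the sample document this should give the same four lines as the transformation above, because every group there has three members.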

Licensed under: CC-BY-SA with attribution