Question

In this question I asked how to perform a conditional increment. The provided answer worked, but it does not scale well to huge data sets.

The Input:

<Users>
    <User>
        <id>1</id>
        <username>jack</username>
    </User>
    <User>
        <id>2</id>
        <username>bob</username>
    </User>
    <User>
        <id>3</id>
        <username>bob</username>
    </User>
    <User>
        <id>4</id>
        <username>jack</username>
    </User>
</Users>

The desired output (computed with optimal time complexity):

<Users>
   <User>
      <id>1</id>
      <username>jack01</username>
   </User>
   <User>
      <id>2</id>
      <username>bob01</username>
   </User>
   <User>
      <id>3</id>
      <username>bob02</username>
   </User>
   <User>
      <id>4</id>
      <username>jack02</username>
   </User>
</Users>

For this purpose it would be nice to

  • sort the input by username
  • for each user
    • when the previous username equals the current username
      • increment the counter
    • otherwise
      • reset the counter to 1
    • set the username to '$username$counter'
  • (sort by id again - not a requirement)

Any thoughts?

Solution 2

This transformation produces exactly the wanted result and is efficient (O(N), assuming the XSLT processor implements keys with constant-time lookup):

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:ext="http://exslt.org/common" exclude-result-prefixes="ext">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:key name="kUserByName" match="User" use="username"/>
 <xsl:key name="kUByGid" match="u" use="@gid"/>

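 <!-- Pass 1: for the first User of each distinct username (Muenchian grouping),
      emit one <u> per User in that group, recording the User's generated id
      and its 1-based position within the group -->
 <xsl:variable name="vOrderedByName">
  <xsl:for-each select=
  "/*/User[generate-id()=generate-id(key('kUserByName',username)[1])]">
     <xsl:for-each select="key('kUserByName',username)">
       <u gid="{generate-id()}" pos="{position()}"/>
     </xsl:for-each>
  </xsl:for-each>
 </xsl:variable>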

 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

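 <xsl:template match="username/text()">
     <xsl:value-of select="."/>
     <!-- Generated id of the enclosing User element -->
     <xsl:variable name="vGid" select="generate-id(../..)"/>

     <!-- key() searches the document of the context node, so switch the
          context to the temporary tree before looking up the position -->
     <xsl:for-each select="ext:node-set($vOrderedByName)[1]">
      <xsl:value-of select="format-number(key('kUByGid', $vGid)/@pos, '00')"/>
     </xsl:for-each>
 </xsl:template>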
</xsl:stylesheet>

When applied to the provided XML document:

<Users>
    <User>
        <id>1</id>
        <username>jack</username>
    </User>
    <User>
        <id>2</id>
        <username>bob</username>
    </User>
    <User>
        <id>3</id>
        <username>bob</username>
    </User>
    <User>
        <id>4</id>
        <username>jack</username>
    </User>
</Users>

the wanted, correct result is produced:

<Users>
   <User>
      <id>1</id>
      <username>jack01</username>
   </User>
   <User>
      <id>2</id>
      <username>bob01</username>
   </User>
   <User>
      <id>3</id>
      <username>bob02</username>
   </User>
   <User>
      <id>4</id>
      <username>jack02</username>
   </User>
</Users>

OTHER TIPS

This is kind of ugly, and I'm not fond of using xsl:for-each, but it should be faster than using the preceding-sibling axis, and it doesn't need a 2-pass approach. Note that the output comes out grouped by username rather than in the original order:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
  <xsl:key name="count" match="User" use="username" />

  <xsl:template match="Users">
    <Users>
      <xsl:for-each select="User[generate-id()=generate-id(key('count',username)[1])]">
        <xsl:for-each select="key('count',username)">
          <User>
            <xsl:copy-of select="id" />
            <username>
              <xsl:value-of select="username" />
              <xsl:number value="position()" format="01"/>
            </username>
          </User>
        </xsl:for-each>
      </xsl:for-each>
    </Users>
  </xsl:template>
</xsl:stylesheet>

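If you really need it sorted by ID afterwards, you can wrap it into a two-pass template. This one uses the Microsoft-specific msxsl:node-set extension; with other processors, the EXSLT node-set function used in the solution above plays the same role: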

<xsl:stylesheet version="1.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:msxsl="urn:schemas-microsoft-com:xslt">
  <xsl:key name="count" match="User" use="username" />

  <xsl:template match="Users">
    <xsl:variable name="pass1">
      <xsl:for-each select="User[generate-id()=generate-id(key('count',username)[1])]">
        <xsl:for-each select="key('count',username)">
          <User>
            <xsl:copy-of select="id" />
            <username>
              <xsl:value-of select="username" />
              <xsl:number value="position()" format="01"/>
            </username>
          </User>
        </xsl:for-each>
      </xsl:for-each>
    </xsl:variable>

    <xsl:variable name="pass1Nodes" select="msxsl:node-set($pass1)" />

    <Users>
      <xsl:for-each select="$pass1Nodes/*">
        <xsl:sort select="id" data-type="number" />
        <xsl:copy-of select="." />
      </xsl:for-each>
    </Users>
  </xsl:template>
</xsl:stylesheet>

Here's a slight variation, though possibly not a great increase in efficiency:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
   <xsl:output method="xml" indent="yes"/>
   <xsl:key name="User" match="User" use="username" />

   <xsl:template match="username/text()">
      <xsl:value-of select="." />
      <xsl:variable name="id" select="generate-id(..)" />
      <xsl:for-each select="key('User', .)">
         <xsl:if test="generate-id(username) = $id">
            <xsl:number value="position()" format="01"/>
         </xsl:if>
      </xsl:for-each>
   </xsl:template>

   <xsl:template match="@*|node()">
      <xsl:copy>
         <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
   </xsl:template>
</xsl:stylesheet>

This defines a key that groups User elements by username. Then, for each username text node, it iterates over the User elements stored in the key under that name and outputs the position of the one whose username element is the current one.
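To see what the key lookup returns on its own, here is a minimal sketch that reuses the key named 'User' from the stylesheet above and hard-codes the name 'bob' from the sample input purely for illustration. It lists the members of that one group together with their position, which is the number the main stylesheet appends:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text"/>
   <xsl:key name="User" match="User" use="username" />

   <xsl:template match="/">
      <!-- key('User', 'bob') selects every User whose username is 'bob';
           xsl:for-each visits them in document order -->
      <xsl:for-each select="key('User', 'bob')">
         <!-- position() is the 1-based index within this name group -->
         <xsl:value-of select="concat(position(), ': id=', id, '&#10;')"/>
      </xsl:for-each>
   </xsl:template>
</xsl:stylesheet>
```

Applied to the sample input, this prints `1: id=2` and `2: id=3`, each on its own line.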

One slight advantage of this approach is that you only look at user records with the same name. Each lookup still scans its whole name group, though, so this is only efficient as long as no single name occurs a huge number of times.
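If an XSLT 2.0 processor happens to be available (an assumption; every stylesheet above is XSLT 1.0), the same grouping could probably be written without keys or generate-id(). The following is only a sketch under that assumption, using xsl:for-each-group so that each User is visited once within its name group:

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="xml" indent="yes"/>
   <xsl:strip-space elements="*"/>

   <xsl:template match="Users">
      <Users>
         <!-- Form one group per distinct username -->
         <xsl:for-each-group select="User" group-by="username">
            <!-- Inside the group, position() serves as the per-name counter -->
            <xsl:for-each select="current-group()">
               <User>
                  <xsl:copy-of select="id"/>
                  <username>
                     <xsl:value-of select="concat(username, format-number(position(), '00'))"/>
                  </username>
               </User>
            </xsl:for-each>
         </xsl:for-each-group>
      </Users>
   </xsl:template>
</xsl:stylesheet>
```

As with the first stylesheet in this section, the result is grouped by username rather than ordered by id, so a final sort would still be needed if the original order matters.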

Licensed under: CC-BY-SA with attribution