Question

I realize many similar questions have been asked, but I have read about a dozen great examples and am still unable to combine them into a working solution.

I have an RSS feed with the following structure:

<root>
    <pubDate>Tue, 03 Sep 2013 15:15:00 +0000</pubDate>
    <title>Title 1</title>
    <pubDate>Tue, 02 Mar 2013 15:15:00 +0000</pubDate>
    <title>Title 2</title>
    <pubDate>Tue, 02 Sep 2012 15:15:00 +0000</pubDate>
    <title>Title 3</title>
    ...
</root>

It is flattened because the feed has some very large nodes that slow the page down enormously, if it loads at all. Therefore, when I pull it in, I limit the selected data to just the title and pubDate fields, which removes the hierarchy. (Other suggestions on this point are welcome.)

I want to display the data grouped by year:

<year handle="2013">
    <date>Tue, 03 Sep 2013 15:15:00 +0000</date>
    <title>Title 1</title>
    <date>Tue, 02 Mar 2013 15:15:00 +0000</date>
    <title>Title 2</title>
</year>
<year handle="2012">
    <date>Tue, 02 Sep 2012 15:15:00 +0000</date>
    <title>Title 3</title>
    ...
</year>
...

I can parse out the year with substring-before(substring-after(substring-after(substring-after(pubDate, ' '), ' '), ' '), ' '), and I have attempted to create a key with:

<xsl:key name="years" match="/root/pubDate" use="substring-before(substring-after(substring-after(substring-after(/root/pubDate, ' '), ' '), ' '), ' ')" />

Which I then used:

<xsl:template match="root">
<rss>
    <xsl:apply-templates mode="year" select="pubDate[
      generate-id()
      =
      generate-id(key('years', substring-before(substring-after(substring-after(substring-after(/root/pubDate, ' '), ' '), ' '), ' '))[1])
    ]"/>
</rss>
</xsl:template>

<xsl:template match="root/pubDate" mode="year">
    <xsl:variable name="year" select="substring-before(substring-after(substring-after(substring-after(., ' '), ' '), ' '), ' ')"/>
    <year handle="{$year}">
         <xsl:apply-templates mode="final" select="key('years', $year)"/>
    </year>
</xsl:template>

<xsl:template match="root/pubDate" mode="final">
    <date>
        <xsl:value-of select="." />
    </date> 
    <title><xsl:value-of select="./following-sibling::*[1]" /></title>
</xsl:template>

But my output is:

<year handle="2013">
    <date>Tue, 03 Sep 2013 15:15:00 +0000</date>
    <title>Title 1</title>
    <date>Tue, 02 Mar 2013 15:15:00 +0000</date>
    <title>Title 2</title>
    <date>Tue, 02 Sep 2012 15:15:00 +0000</date>
    <title>Title 3</title>
    ...
</year>

I seem to be able to solve individual parts of this problem but I have not been able to get them all working together. There are many examples of grouping RSS by year and month, but that is just different enough from my problem (especially considering my flattened RSS) that I cannot emulate those examples.

If it helps to know, I am using XSLT / XPath 1.0 because I am using Symphony CMS, which relies on PHP's libxslt.

Any help is very welcome, thanks for reading.


Solution 2

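Your code has two XPath mistakes: both the use attribute of the key and the predicate in the root template refer to /root/pubDate where they should refer to the current node (.). In XPath 1.0, passing the node-set /root/pubDate to a string function uses only its first node in document order, so every pubDate is indexed under 2013 and only one group is produced. This fixed stylesheet provides the desired output: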

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:key name="years" match="/root/pubDate" use="substring-before(substring-after(substring-after(substring-after(., ' '), ' '), ' '), ' ')" />

  <xsl:template match="root">
    <rss>
      <xsl:apply-templates mode="year" select="pubDate[
        generate-id()
        =
        generate-id(key('years', substring-before(substring-after(substring-after(substring-after(., ' '), ' '), ' '), ' '))[1])
        ]"/>
    </rss>
  </xsl:template>

  <xsl:template match="root/pubDate" mode="year">
    <xsl:variable name="year" select="substring-before(substring-after(substring-after(substring-after(., ' '), ' '), ' '), ' ')"/>
    <year handle="{$year}">
      <xsl:apply-templates mode="final" select="key('years', $year)"/>
    </year>
  </xsl:template>

  <xsl:template match="root/pubDate" mode="final">
    <date>
      <xsl:value-of select="." />
    </date>
    <title><xsl:value-of select="./following-sibling::*[1]" /></title>
  </xsl:template>

</xsl:stylesheet>
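
To see why the absolute path was the problem, a small diagnostic stylesheet can print both versions of the year expression side by side. This is only a sketch, assuming the flattened input from the question; the labels "absolute" and "current" are made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/root">
    <xsl:for-each select="pubDate">
      <!-- Absolute path: /root/pubDate selects ALL pubDate elements. When a
           node-set is converted to a string, XPath 1.0 uses only its first
           node in document order, so this is the same value every time. -->
      <xsl:text>absolute: </xsl:text>
      <xsl:value-of select="substring-before(substring-after(substring-after(substring-after(/root/pubDate, ' '), ' '), ' '), ' ')"/>
      <!-- Relative path: "." is the pubDate currently being processed. -->
      <xsl:text>  current: </xsl:text>
      <xsl:value-of select="substring-before(substring-after(substring-after(substring-after(., ' '), ' '), ' '), ' ')"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With the three sample items, the "absolute" column should read 2013 on every line, while the "current" column follows each item's own year. That matches the single 2013 group in the question's output.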

There are shorter solutions as well. In my opinion, splitting the work across several templates is no win in this case: it does not make the code more readable, and the templates do not need to be reused. I would also handle some details differently. You might consider the following code, or parts of it:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:key name="pubdates-by-year" match="/root/pubDate" use="substring-before(substring-after(substring-after(substring-after(., ' '), ' '), ' '), ' ')" />

  <xsl:template match="root">
    <rss>
      <xsl:for-each select="pubDate[count(. | key('pubdates-by-year', substring-before(substring-after(substring-after(substring-after(., ' '), ' '), ' '), ' '))[1]) = 1]">
        <xsl:variable name="year" select="substring-before(substring-after(substring-after(substring-after(., ' '), ' '), ' '), ' ')"/>
        <year handle="{$year}">
          <xsl:for-each select="key('pubdates-by-year', $year)">
            <xsl:copy-of select="."/>
            <xsl:copy-of select="following-sibling::title[1]"/>
          </xsl:for-each>
        </year>
      </xsl:for-each>
    </rss>
  </xsl:template>

</xsl:stylesheet>
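
The two stylesheets above pick the first pubDate of each year in different ways: one compares generate-id() values, the other checks that the union of the current node and the first node in its key group contains exactly one node. As a hedged sketch, assuming the same input and the pubdates-by-year key, the following prints both tests for every pubDate so they can be compared. The variable name first is hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:key name="pubdates-by-year" match="/root/pubDate" use="substring-before(substring-after(substring-after(substring-after(., ' '), ' '), ' '), ' ')" />

  <xsl:template match="/root">
    <xsl:for-each select="pubDate">
      <!-- First pubDate (in document order) that shares this node's year. -->
      <xsl:variable name="first" select="key('pubdates-by-year', substring-before(substring-after(substring-after(substring-after(., ' '), ' '), ' '), ' '))[1]"/>
      <xsl:value-of select="."/>
      <!-- Union test: the union has one node only if both are the same node. -->
      <xsl:text> | count: </xsl:text>
      <xsl:value-of select="count(. | $first) = 1"/>
      <!-- Identity test: equal generated ids mean the same node. -->
      <xsl:text> | generate-id: </xsl:text>
      <xsl:value-of select="generate-id() = generate-id($first)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Both columns should agree on every line, reporting true only for the pubDate that opens a new year group. Which form to use is largely a matter of taste.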

OTHER TIPS

This is something of an edge case for XSLT; a general-purpose programming language would handle it far more efficiently.

The first thing I would try is to limit the feed returned by the source. Your RSS provider may support a query argument that restricts the content, for example by date or ID.

Otherwise, if the loading of the full RSS is slow, I don't think it would be any faster to transform the data first.

If your concern is the page load time, you could load the RSS asynchronously with AJAX after the page has rendered.

Licensed under: CC-BY-SA with attribution