Trying to use for-each-group with group-starting-with to process sibling elements, but getting 'trailing' groups

https://stackoverflow.com/questions/22972212

xml
xslt-2.0

30-06-2023

Question

This is a simplified sample of my XML:

    <?xml version="1.0" encoding="UTF-8"?>
    <root>
      <text>
        <inlineTag name="Story">
          <inlineTag name="_01_head">Headline </inlineTag>
          <inlineTag name="_03_deck">leadin content</inlineTag>
          <inlineTag name="_02_byline">Author One</inlineTag>
          <inlineTag name="_02_byline">Author Two </inlineTag>
          <inlineTag name="_04_body_1stpara">Lead in paragraph. lead in paragraph. lead in  paragraph.</inlineTag>
          <inlineTag name="_04_body">BodyCopyBodyCopy blah blah blah
            <inlineTag name="_italic">Inline styles in body copy</inlineTag>.
            BodyCopyBodyCopy blah blah blah. BodyCopyBodyCopyblahblah blah.
          </inlineTag>
          <inlineTag name="_01_head">Another Headline</inlineTag>
          <inlineTag name="_04_body">BodyCopyBodyCopyblahblah blahBodyCopyBodyCopyblahblah blahBodyCopyBodyCopyblahblah blah
            <inlineTag name="_italic">Inline styles in body copy</inlineTag>].
            BodyCopyBodyCopyblahblah blahBodyCopyBodyCopyblahblah.
          </inlineTag>
        </inlineTag>
      </text>
    </root>

Each instance of inlineTag[@name='_01_head'] should produce a separate result document, like this:

    <headline>Headline </headline>
    <deck>leadin content</deck>
    <bylines>
     <byline>Author One</byline>
    <byline>Author Two </byline>
    </bylines>
    <p lede='true'>Lead in paragraph. lead in paragraph. lead in paragraph.</p>
    <p>BodyCopyBodyCopy blah blah blah 
    <em style="italic">Inline styles in body copy</em>.
     BodyCopyBodyCopy blah blah blah. BodyCopyBodyCopyblahblah blah.
    </p>

and another result document:

    <headline>Another Headline </headline>
    <p>BodyCopyBodyCopy blah blah blah 
    <em style="italic">Inline styles in body copy</em>.
     BodyCopyBodyCopy blah blah blah. BodyCopyBodyCopyblahblah blah.
    </p>

and so on, for as many inlineTag[@name='_01_head'] elements as exist under text/inlineTag[@name='Story']...

I can get close to what I want when using something similar to:

    <xsl:for-each-group select="." group-starting-with="inlineTag[@name='_01_head']">
      <xsl:for-each select="current-group()">
        <xsl:result-document href="A unique naming sequence based on H1 count">
          <xsl:apply-templates select="."/> <!-- handles creation of the desired tagging -->
        </xsl:result-document>
      </xsl:for-each>
    </xsl:for-each-group>

BUT:

No matter how I applied the grouping, the first result document contained the first element and ALL of its siblings, the second excluded the first element but contained all the following siblings, the third excluded the first and second, and so forth ... OR I got each individual element in its own result document. In all cases the element names came out properly formed and the result document names were distinct (so, yeah, at least I got that going for me...).

Additionally, I cannot impose any other structure on the source XML, such as a containing element:

    <Story>
      <separate><inlineTag name="_01_head">...</inlineTag>...</separate>
      <separate><inlineTag name="_01_head">...</inlineTag>...</separate>
    </Story>

So, the question:

Given the above example, how do I construct the for-each-group group-starting-with statement and the subsequent processing so that each result document contains JUST the content from one inlineTag[@name='_01_head'] up to, but not including, the NEXT inlineTag[@name='_01_head'], without also 'capturing' the contents of the following groups?

And, thanks for reading this far, and thanks in advance for any guidance.

Solution

It is hard to tell what you are doing, because the snippet for-each-group select="." looks very odd and you don't show its context. Note that select="." puts only the context node into the population, so only a single group can be formed.

Based on your XML and the output you describe, I think the following should help:

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        exclude-result-prefixes="xs"
        version="2.0">

    <xsl:template match="@* | node()">
      <xsl:copy>
        <xsl:apply-templates select="@* , node()"/>
      </xsl:copy>
    </xsl:template>

    <xsl:template match="root/text/inlineTag[@name = 'Story']">
      <xsl:for-each-group select="*" group-starting-with="inlineTag[@name = '_01_head']">
        <xsl:result-document href="Story{position()}.xml">
          <root>
            <xsl:apply-templates select="current-group()"/>
          </root>
        </xsl:result-document>
      </xsl:for-each-group>
    </xsl:template>

    <xsl:template match="inlineTag[@name = '_01_head']">
      <headline>
        <xsl:apply-templates/>
      </headline>
    </xsl:template>

    <xsl:template match="inlineTag[@name = '_03_deck']">
      <deck>
        <xsl:apply-templates/>
      </deck>
    </xsl:template>

    </xsl:stylesheet>

I think this shows the right approach, both for the grouping and for transforming the other inline elements. You will need to add more templates for the other kinds of elements you have, and you may need a nested grouping if adjacent byline elements are to be wrapped together. But let's first establish whether the above is the right direction: applied to your sample, it outputs two files, Story1.xml and Story2.xml.
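
As a possible next step, the sketch below extends the stylesheet above with a nested group-adjacent pass that wraps consecutive _02_byline elements in a bylines element, and adds templates for the remaining name values. The output element names (bylines, byline, p with a lede attribute, em with a style attribute) are taken from the desired output in the question. Keeping the root wrapper and grouping on a boolean key are assumptions of this sketch, not something the original answer specifies.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <!-- One result document per group; each group starts at a _01_head element -->
  <xsl:template match="root/text/inlineTag[@name = 'Story']">
    <xsl:for-each-group select="*" group-starting-with="inlineTag[@name = '_01_head']">
      <xsl:result-document href="Story{position()}.xml">
        <root>
          <!-- Nested grouping (assumption): consecutive bylines share the key true() -->
          <xsl:for-each-group select="current-group()"
              group-adjacent="boolean(self::inlineTag[@name = '_02_byline'])">
            <xsl:choose>
              <xsl:when test="current-grouping-key()">
                <!-- A run of adjacent bylines gets a single wrapper -->
                <bylines>
                  <xsl:apply-templates select="current-group()"/>
                </bylines>
              </xsl:when>
              <xsl:otherwise>
                <!-- Everything else is transformed in place, unwrapped -->
                <xsl:apply-templates select="current-group()"/>
              </xsl:otherwise>
            </xsl:choose>
          </xsl:for-each-group>
        </root>
      </xsl:result-document>
    </xsl:for-each-group>
  </xsl:template>

  <xsl:template match="inlineTag[@name = '_01_head']">
    <headline><xsl:apply-templates/></headline>
  </xsl:template>

  <xsl:template match="inlineTag[@name = '_03_deck']">
    <deck><xsl:apply-templates/></deck>
  </xsl:template>

  <xsl:template match="inlineTag[@name = '_02_byline']">
    <byline><xsl:apply-templates/></byline>
  </xsl:template>

  <xsl:template match="inlineTag[@name = '_04_body_1stpara']">
    <p lede="true"><xsl:apply-templates/></p>
  </xsl:template>

  <xsl:template match="inlineTag[@name = '_04_body']">
    <p><xsl:apply-templates/></p>
  </xsl:template>

  <!-- Inline style nested inside body copy -->
  <xsl:template match="inlineTag[@name = '_italic']">
    <em style="italic"><xsl:apply-templates/></em>
  </xsl:template>

</xsl:stylesheet>
```

Because group-adjacent only merges items that are next to each other, two bylines separated by some other element would each get their own bylines wrapper. If the desired output must not have the root wrapper, it can be dropped, at the cost of result documents that have more than one top-level element.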

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow