Question

I want to be able to show all of the elements that come between other elements with certain values. E.g.:

<wd>abc</wd>
<wd>123</wd>
<wd>456</wd>
<wd>789</wd>
<wd>def</wd>

I want the code to look for all words after abc and before def, and display them.

What I have tried so far is this (the namespace prefix is ss):

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ss="http://www.scansoft.com/omnipage/xml/ssdoc-schema3.xsd">
<xsl:output method="text"/>

    <xsl:template match="/">



    <!-- Variable declarations -->

    <xsl:variable name="wds" select="//ss:wd"/>

    <!-- Variable declarations end-->

        <xsl:if test="preceding::ss:wd[contains(.,'7BB')">
            <xsl:if test="following::ss:wd[contains(.,SHIPMENT)">
                <xsl:for-each select="$wds"/>
                <xsl:value-of select="$wds"/>
            </xsl:if>
        </xsl:if>

    </xsl:template>
</xsl:stylesheet>

But this doesn't work at all.
How do I get around this?

Update: In response to Michael

Unless I'm overlooking something, your code should be able to be copied + pasted into mine. However, when I do this, the XSLT executes, but no data is returned.

This is what I have:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ss="http://www.scansoft.com/omnipage/xml/ssdoc-schema3.xsd">

<xsl:template match="/">



<!-- Variable declarations -->

<xsl:variable name="wds" select="//ss:wd"/>

<!-- Variable declarations end-->

        <xsl:for-each select="ss:document/ss:wd[preceding-sibling::ss:wd[.='7BB'] and following-sibling::ss:wd[.='SHIPMENT']]">
            <xsl:value-of select="." />
            <xsl:if test="position()!=last()">
                <xsl:text>/</xsl:text>  
            </xsl:if>
        </xsl:for-each>   


        Net Amount <xsl:value-of select="$wds[4]"/>
        <xsl:text>&#10;</xsl:text>  
        VAT Amount <xsl:value-of select="$wds[8]"/>
        <xsl:text>&#10;</xsl:text>  
        Total <xsl:value-of select="$wds[12]"/>


</xsl:template>

This is my entire XSLT so far. I've also attached my source: dropbox


Solution 2

Here's a very simple method:

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ss="http://www.scansoft.com/omnipage/xml/ssdoc-schema3.xsd">

<xsl:output method="text" encoding="UTF-8"/>

<xsl:template match="/">
    <xsl:for-each select="ss:root/ss:wd[preceding-sibling::ss:wd[.='abc'] and following-sibling::ss:wd[.='def']]">
        <xsl:value-of select="." />
        <xsl:if test="position()!=last()">
            <xsl:text>/</xsl:text>  
        </xsl:if>
    </xsl:for-each>           
</xsl:template>

</xsl:stylesheet>

Note that this assumes the input has the following form:

<root xmlns="http://www.scansoft.com/omnipage/xml/ssdoc-schema3.xsd">
    <wd>001</wd>
    <wd>002</wd>
    <wd>abc</wd>
    <wd>123</wd>
    <wd>456</wd>
    <wd>789</wd>
    <wd>def</wd>
    <wd>998</wd>
    <wd>999</wd>
</root>

Given this input, the result of applying the above transformation is:

123/456/789

You did not provide your (full) input or required output, so you will need to make the necessary adjustments.

IMPORTANT: We are also assuming that there is only one occurrence of <wd>abc</wd> and <wd>def</wd> in the entire node-set. Otherwise it won't be that simple.
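
If the markers can occur more than once, one possible adjustment (a sketch only, not tested against your real document) is to keep a <wd> only when the nearest preceding marker is abc rather than def, and a closing def still follows. It assumes the same ss:root/ss:wd layout as the stylesheet above and that the marker elements themselves should be left out:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ss="http://www.scansoft.com/omnipage/xml/ssdoc-schema3.xsd">

<xsl:output method="text" encoding="UTF-8"/>

<xsl:template match="/">
    <!-- 1st predicate: skip the marker elements themselves (assumption).
         2nd predicate: preceding-sibling is a reverse axis, so [1] picks the
         nearest preceding marker; keep the node only if that marker is 'abc'.
         3rd predicate: require a closing 'def' somewhere after the node. -->
    <xsl:for-each select="ss:root/ss:wd[not(.='abc' or .='def')]
                          [preceding-sibling::ss:wd[.='abc' or .='def'][1][.='abc']]
                          [following-sibling::ss:wd[.='def']]">
        <xsl:value-of select="."/>
        <xsl:if test="position()!=last()">
            <xsl:text>/</xsl:text>
        </xsl:if>
    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

On the sample input above this still yields 123/456/789. With several abc ... def ranges, the words from all of the ranges end up in one slash-separated list.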

--

A note about performance: it is difficult to predict performance without testing on the actual processor you will be using. In general, an explicit path is faster than a broad one: it's better to say ss:root/ss:wd than //ss:wd, and ss:wd is preferable to *.


Edit:

The structure of the document you have linked to is significantly different from the example in your question. Specifically, the <wd l="1675" t="4243" r="1939" b="4358">7BB</wd> node has no following siblings, since it is the last child of its <ln> parent. Note also that a <wd> with a value of SHIPMENT appears twice.

Nevertheless, I ran the following test stylesheet against it:

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ss="http://www.scansoft.com/omnipage/xml/ssdoc-schema3.xsd"
exclude-result-prefixes="ss">

<xsl:output method="xml" version="1.0" encoding="utf-8" indent="yes"/>
<xsl:template match="/">
<test>
    <xsl:for-each select="//ss:wd[preceding::ss:wd[.='7BB'] and following::ss:wd[.='SHIPMENT']]">
        <wd>
            <xsl:value-of select="." />
        </wd>
     </xsl:for-each>   
</test>
</xsl:template>

</xsl:stylesheet>

and obtained the following result:

<?xml version="1.0" encoding="utf-8"?>
<test>
   <wd>National</wd>
   <wd>Distribution</wd>
   <wd>Centre</wd>
   <wd>Kelway</wd>
   <wd>Ltd</wd>
   <wd>Unit</wd>
   <wd>19,</wd>
   <wd>Glebe</wd>
   <wd>Farm</wd>
   <wd>Road</wd>
   <wd>Glebe</wd>
   <wd>Farm</wd>
   <wd>Industrial</wd>
   <wd>Estate</wd>
   <wd>Rugby</wd>
   <wd>CV21</wd>
   <wd>1GQ</wd>
</test>

Hopefully, that's something you can work with.

OTHER TIPS

Here is a naive and poorly performing method for achieving your goal in XSLT 1.0:

/*/*[.='abc'][1]/following-sibling::*[
    not(.='def' or preceding-sibling::*[.='def'])]

In English:

Retrieve all siblings following the first element containing abc that are not themselves an element containing def and that do not have a previous sibling that contains def (i.e. elements that don't appear after an element containing def).

Some people will tell you that you should never do this. I think they're wrong. There are plenty of situations (especially on small data sets) where this is the simplest and most obvious solution. There are other situations (especially on large data sets) where this method will break down.
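
For reference, here is a minimal sketch of that naive expression wrapped in a complete stylesheet. It assumes the input is a single root element whose children are the wd elements abc, 123, 456, 789 and def, with no namespace; adjust the path if your document is shaped differently:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes" />
    <xsl:strip-space elements="*" />
    <xsl:template match="/">
        <!-- Start at the first child equal to 'abc', then take every following
             sibling that is neither 'def' itself nor located after a 'def'. -->
        <xsl:copy-of select="/*/*[.='abc'][1]/following-sibling::*[
            not(.='def' or preceding-sibling::*[.='def'])]" />
    </xsl:template>
</xsl:stylesheet>
```

On that input it copies the three wd elements 123, 456 and 789. Each candidate node scans its own preceding siblings again, which is why this approach tends to slow down as the number of siblings grows.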

A better technique for retrieving the intersection of two node sets (especially on large data sets) is the Kayessian method. It looks like this:

$ns1[count(.|$ns2)=count($ns2)]

In English (informally):

Take all the nodes from $ns1 such that adding that node to $ns2 does not increase its size

More technically, if the set created by the union of a node a and a set $ns2 has the same number of elements as $ns2, then a must already be in that set. We want every element from $ns1 for which this is true.

In our case, we want the intersection of the sets 1) every sibling after the first node containing abc and 2) every sibling before the first node containing def. It looks something like this (depending on the structure of your input):

/*/*[.='abc'][1]/following-sibling::*[
    count(.| /*/*[.='def'][1]/preceding-sibling::*)=
    count(/*/*[.='def'][1]/preceding-sibling::*)]

Here's a full example:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes" />
    <xsl:strip-space elements="*" />
    <xsl:variable name="ns1" select="/*/*[.='abc'][1]/following-sibling::*" />
    <xsl:variable name="ns2" select="/*/*[.='def'][1]/preceding-sibling::*" />
    <xsl:template match="/">
        <xsl:copy-of select="$ns1[count(.|$ns2)=count($ns2)]" />
    </xsl:template>
</xsl:stylesheet>

On this input:

<root>
    <wd>abc</wd>
    <wd>123</wd>
    <wd>456</wd>
    <wd>789</wd>
    <wd>def</wd>
</root>

You get this output:

<wd>123</wd>
<wd>456</wd>
<wd>789</wd>

XSLT 2.0 has the node comparison operators << and >>, which compare two nodes by document order (<< needs to be written as &lt;&lt; in XSLT stylesheets). An expression such as //ss:wd[. >> //ss:wd[. = 'abc'] and . &lt;&lt; //ss:wd[. = 'def']] might do, provided each marker value occurs only once, because both operators expect a single node on each side. There is also xsl:for-each-group with group-starting-with/group-ending-with, which could help as well.
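
A minimal sketch of the document-order approach, assuming an XSLT 2.0 processor and the ss namespace from the question. The variable names start and end are made up for illustration, and the (...)[1] wrappers are an added safeguard so that each operand is a single node even when a marker occurs more than once:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ss="http://www.scansoft.com/omnipage/xml/ssdoc-schema3.xsd">

<xsl:output method="text" encoding="UTF-8"/>

<xsl:template match="/">
    <!-- First 'abc' and first 'def' in document order (hypothetical names). -->
    <xsl:variable name="start" select="(//ss:wd[. = 'abc'])[1]"/>
    <xsl:variable name="end" select="(//ss:wd[. = 'def'])[1]"/>
    <!-- Keep every wd after $start and before $end, at any depth;
         the separator attribute (XSLT 2.0) joins the values with '/'. -->
    <xsl:value-of select="//ss:wd[. >> $start and . &lt;&lt; $end]" separator="/"/>
</xsl:template>

</xsl:stylesheet>
```

Because the comparison is by document order rather than by sibling relationship, this does not require the wd elements to share a parent.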

The XML you linked to does not have the wd elements as siblings; they sit at deeper levels. You might want to check whether

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ss="http://www.scansoft.com/omnipage/xml/ssdoc-schema3.xsd">

<xsl:template match="/">



<!-- Variable declarations -->

<xsl:variable name="wds" select="//ss:wd"/>

<!-- Variable declarations end-->

        <xsl:for-each select="$wds[preceding::ss:wd[.='7BB'] and following::ss:wd[.='SHIPMENT']]">
            <xsl:value-of select="." />
            <xsl:if test="position()!=last()">
                <xsl:text>/</xsl:text>  
            </xsl:if>
        </xsl:for-each>   


        Net Amount <xsl:value-of select="$wds[4]"/>
        <xsl:text>&#10;</xsl:text>  
        VAT Amount <xsl:value-of select="$wds[8]"/>
        <xsl:text>&#10;</xsl:text>  
        Total <xsl:value-of select="$wds[12]"/>


</xsl:template>

</xsl:stylesheet>

gives you the result you want. For me it outputs:

National/Distribution/Centre/Kelway/Ltd/Unit/19,/Glebe/Farm/Road/Glebe/Farm/Industrial/Estate/Rugby/CV21/1GQ


        Net Amount 145.31

        VAT Amount 29.06

        Total 174.37
Licensed under: CC-BY-SA with attribution