Question

I have an XML document in the following format:

<Contents>
  <Content Name="ClientXML">
    <EntityData>
        <Data Name="EQ_EligibleForGuaranteedIssue">Yes</Data>
        <Data Name="ABRInd">NO</Data>
        <Data Name="AC_AgentNo">12345</Data>
        <Data Name="AC_AgentPersonallyMetWithApplicant">Has</Data>
        <Data Name="AC_City">Pomona</Data>
        <Data Name="AC_FirstName">Kimmy</Data>
        <Data Name="AC_FullName">Kimmy N Jackson</Data>
        <Data Name="AC_Initials">K J</Data>
        <Data Name="AC_LastAndSuf">Jackson</Data>
        ...
    </EntityData>
  </Content>
  <Content Name="UserXML">
    <EntityData>
        <Data Name="TransRefGUID">789-456-123456789-456</Data>
        ...
    </EntityData>
  </Content>
</Contents>

Other information:

  1. There can be several thousand 'Data' nodes under each 'EntityData' element.
  2. The value of any 'Name' attribute is never duplicated.

I have to create an XSL transform and am using the xsl:value-of select="..." instruction. My question is: which XPath expression is going to execute the fastest? For example:

<xsl:value-of select="//Contents/Content[@Name='ClientXML']/EntityData/Data[@Name='..']"/>

or simply

<xsl:value-of select="//Data[@Name='..']"/>

I don't have access to the end server which will eventually run this process, and locally the second option may appear to be a little faster.

I'm wondering whether anyone has an opinion on which of the two would be faster on a much larger scale.

Thanks!


Solution

Using keys in XSLT will be far faster than a plain XPath expression, especially one with //, which searches the whole document, can be very slow to execute, and should only be used when necessary.

<xsl:key match="Content" use="@Name" name="MyContentsLookup"/>
...
<xsl:value-of select="key('MyContentsLookup','ClientXML')"/>

An XSLT processor can implement internal search mechanisms to quickly look up a value in tens of thousands of entries, far faster than with XPath.
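Applied to the Data elements from the question, the same idea might look like the sketch below. It assumes that Name values are unique across the whole document; the key name DataByName and the looked-up value 'AC_City' are illustrative choices, not part of the original stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Index every Data element by its Name attribute.
       Assumption: Name values are unique document-wide. -->
  <xsl:key name="DataByName" match="Data" use="@Name"/>

  <xsl:template match="/">
    <!-- Fetch one Data element through the key rather than
         scanning the document with a // expression. -->
    <xsl:value-of select="key('DataByName', 'AC_City')"/>
  </xsl:template>
</xsl:stylesheet>
```

The key declaration is a top-level element, so it is written once and can then be reused by every lookup in the stylesheet.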

I've published an overview of XSLT keys here: http://www.CraneSoftwrights.com/resources/xslkeys/index.htm

OTHER TIPS

When you say the contents of Name are never duplicated, is that true across the document as a whole, or only within each Content element? If it's true globally, then Ken's technique using keys is ideal. If it's only true locally, you might want to consider setting up a key that combines Content/@Name with Data/@Name.
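A combined key of that kind could be sketched roughly as follows. The key name DataByContentAndName and the '|' separator are assumptions made for illustration; the separator must be a character that never occurs in a Name value.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Index each Data element by "ContentName|DataName".
       ../../@Name climbs from Data to EntityData to Content.
       Assumption: '|' never appears in any Name value. -->
  <xsl:key name="DataByContentAndName" match="Data"
           use="concat(../../@Name, '|', @Name)"/>

  <xsl:template match="/">
    <!-- Look up the AC_City entry that belongs to the ClientXML block. -->
    <xsl:value-of select="key('DataByContentAndName', 'ClientXML|AC_City')"/>
  </xsl:template>
</xsl:stylesheet>
```

Building the key value with concat() keeps the stylesheet valid XSLT 1.0, where a key can only be looked up by a single string value.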

The other thing to bear in mind is that performance depends on your processor. Implementors have a great deal of freedom to optimize the same expression in different ways. Even within the same product family, Saxon-EE will execute the expression //Data[@Name='abc'] very differently from the way Saxon-HE implements it (in effect, Saxon-EE creates keys automatically where needed, rather than requiring you to create them by hand). So you can't ask performance questions except in relation to a specific implementation.

Licensed under: CC-BY-SA with attribution