Saxon XSLT Transformation from Command Line for directory of files with shared reference document

StackOverflow https://stackoverflow.com/questions/22598366

  •  19-06-2023

Question

I am running Saxon 9 HE from the command line to convert a directory of XML files. The XSLT loads a couple of shared reference documents to look up common information. One is a company listing, used to verify the company and return company information to display on the resulting HTML page; another is a part list, used to verify the part and return part information to display on the page. Multiple versions of these files can exist in the XML directory, so I get a collection of them and take the last one, which is the most recent.

What I am wondering is: during the command-line transformation, are these shared documents loaded and cached in memory once for all input XML files, or are they reloaded for every input XML file in the directory being processed?

sample xml

<companies>
    <company code="123">
        <address>
            <street>1 MAIN STREET</street>
            <city>City</city>
            <state>ST</state>
            <country>USA</country>
            <phoneNumber>800-123-4567</phoneNumber>
        </address>
    </company>
</companies>

sample xslt

<?xml version="1.0"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
           xmlns="http://www.w3.org/1999/xhtml">
   <xsl:output method="xhtml" xpath-default-namespace="http://www.w3.org/1999/xhtml" 
           doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
           doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN" 
           indent="yes" 
           omit-xml-declaration="yes"/>

<xsl:param name="currentDir">false</xsl:param>
<xsl:variable name="companyCollection" select="collection(iri-to-uri(concat('file:///', $currentDir, '/xml/', '?select=company_', '*.(xml|XML)')))[last()]"/>
<xsl:variable name="companyDoc" select="$companyCollection//companies"/>  
<xsl:key name="companyKey" match="company" use="@code"/>

If they are loaded for each individual transformation, how can I get them loaded only once while processing the directory of XML files? If that is not possible from the command line, is it possible from Java?

Or would it be best/fastest to know the file name ahead of time and hard-code it in the document() call, to avoid loading it for each input XML file?

<xsl:variable name="companyDoc" select="document('company.xml')/companies"/>

We are processing thousands of xml files in the directory and would like to make it as efficient as can be.

Thank you!

Solution

When you process a directory from the command line, each input file is processed using a new Transformer, so there is no caching of source documents.

If the lookup files were known statically (doc('lookup.xml')) then you could force compile-time loading of the document by adding the option --preEvaluateDocFunction:on. Since the stylesheet is only compiled once, this would have the effect of only loading the lookup document once.
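A minimal sketch of what the statically known case could look like. It assumes the reference file has a fixed name (company.xml, as in the question), that it sits next to the stylesheet so the relative URI resolves against the stylesheet's base URI, and that each company element carries a code attribute and contains its address. The order element and its company attribute are hypothetical, used only to show a lookup.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Static URI literal: with preEvaluateDocFunction on, Saxon can load this while compiling. -->
  <xsl:variable name="companyDoc" select="doc('company.xml')"/>

  <!-- Index company elements by their code attribute (attribute name assumed from the sample XML). -->
  <xsl:key name="companyKey" match="company" use="@code"/>

  <!-- Hypothetical input element, e.g. <order company="123"/>. -->
  <xsl:template match="order">
    <!-- The third argument points key() at the lookup document rather than the input document. -->
    <xsl:value-of select="key('companyKey', @company, $companyDoc)/address/city"/>
  </xsl:template>

</xsl:stylesheet>
```

The directory run would then look something like java -cp saxon9he.jar net.sf.saxon.Transform -s:xml -xsl:company.xsl -o:html --preEvaluateDocFunction:on, where the jar, stylesheet, and directory names are placeholders.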

Generally you will have a lot more control over the execution if you run the job from a Java application (using s9api) rather than from the command line.
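One way to use that control, sketched under assumptions: in Java, compile the stylesheet once into an XsltExecutable, build the lookup document once with an s9api DocumentBuilder, and pass the resulting XdmNode to every XsltTransformer through setParameter. The stylesheet then declares the document as a parameter instead of loading it itself. The parameter name companyDoc is borrowed from the question; the lookupCompany template is a hypothetical helper.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Supplied by the calling application: parsed once, reused for every input file. -->
  <xsl:param name="companyDoc" as="document-node()" required="yes"/>

  <!-- Index company elements by their code attribute (attribute name assumed from the sample XML). -->
  <xsl:key name="companyKey" match="company" use="@code"/>

  <!-- Hypothetical helper: returns the company element matching the given code. -->
  <xsl:template name="lookupCompany">
    <xsl:param name="code"/>
    <xsl:sequence select="key('companyKey', $code, $companyDoc)"/>
  </xsl:template>

</xsl:stylesheet>
```

With this arrangement the application can also pick the most recent company file itself before building it, so the collection() call is no longer needed in the stylesheet.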

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow