Question

I'm fishing for approaches to a problem with XSLT processing.

Is it possible to use parallel processing to speed up an XSLT processor? Or are XSLT processors inherently serial?

My hunch is that XML can be partitioned into chunks which could be processed by different threads, but since I'm not really finding any documentation of such a feat, I'm getting skeptical. Is it possible to use StAX to concurrently chunk XML?

It seems that most XSLT processors are implemented in Java or C/C++, but I really don't have a target language. I just want to know if a multi-threaded XSLT processor is conceivable.

What are your thoughts?


Solution

As in most programming languages, looping in XSLT is inherently parallelizable as long as you follow a couple of rules. This is known as data parallelism:

  • No mutation of shared state in the loop
  • One iteration of the loop cannot depend on the outcome of another iteration

Any looping construct in XSLT, such as xsl:for-each, could be parallelized fairly easily under these rules.
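To make the two rules concrete, here is a minimal sketch in Python rather than XSLT. It is only an illustration of data parallelism, not how any real XSLT processor is implemented. The names `SOURCE`, `render_item` and `render_all` are made up for this example, and the "loop body" stands in for the body of an xsl:for-each. Note that in CPython, threads show the structure of the idea more than they deliver a CPU speedup.

```python
from concurrent.futures import ThreadPoolExecutor
import xml.etree.ElementTree as ET

# Hypothetical input document used only for this illustration.
SOURCE = "<items><item>a</item><item>b</item><item>c</item></items>"


def render_item(item):
    """Loop body: reads only its own node and writes no shared state."""
    return f"<li>{item.text.upper()}</li>"


def render_all(xml_text, workers=4):
    """Apply render_item to every <item>, one iteration per pool task."""
    root = ET.fromstring(xml_text)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, regardless of which
        # thread happens to finish first.
        return list(pool.map(render_item, root.findall("item")))


if __name__ == "__main__":
    print("".join(render_all(SOURCE)))
```

Because no iteration depends on another and nothing shared is mutated, the parallel result is the same as what a plain sequential loop would produce.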

With similar rules against mutation and dependencies, you could parallelize most of an XSLT transformation using a kind of task-based parallelism.

First, fragment the whole document into tasks, segmented at XSLT command and text node boundaries; each task should be assigned a sequential index according to its position in the document (top to bottom).

Next, scatter the tasks to distinct XSLT processing functions, each running on a different thread; these processors will all need to be initialized with the same global state (variables, constants, etc.).

Finally, once all the transformations are complete, the controlling thread should gather the results (transformed strings) in index order and assemble them into the finished document.
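The fragment, scatter and gather steps could be sketched roughly as follows. This is a toy in Python under heavy assumptions: a "task" is just one line of text rather than a real XSLT command or text node, `transform` is a stand-in for an XSLT processing function, and `GLOBALS`, `fragment` and `run` are hypothetical names invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from types import MappingProxyType

# Read-only "global state" handed to every worker (hypothetical variable).
GLOBALS = MappingProxyType({"title_prefix": "Chapter: "})


def fragment(document):
    """Split the document into (index, text) tasks, top to bottom.

    Assumption: one line equals one task; a real processor would split
    at XSLT command and text node boundaries instead.
    """
    return list(enumerate(document.splitlines()))


def transform(index, text, global_state):
    """Stand-in for one XSLT processing function; performs no shared writes."""
    return index, global_state["title_prefix"] + text.strip().title()


def run(document, workers=4):
    tasks = fragment(document)
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Scatter: every task gets the same read-only global state.
        futures = [pool.submit(transform, i, t, GLOBALS) for i, t in tasks]
        # Completion order is arbitrary, so results are keyed by index.
        for future in as_completed(futures):
            index, output = future.result()
            results[index] = output
    # Gather: reassemble strictly in index order.
    return "\n".join(results[i] for i in sorted(results))


if __name__ == "__main__":
    print(run("intro\nparallel xslt\n"))
```

Keying the results by index is what lets the controlling thread ignore the order in which workers finish and still rebuild the document top to bottom.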

OTHER TIPS

Saxon: Anatomy of an XSLT Processor is an excellent article about XSLT processors, and Saxon in particular. It covers multithreading.

Saxon, by the way, is available for both .NET and Java and is one of the best processors available.

A late answer, for people who hit this thread as a result of a search. At the time this question was asked, multithreading in XSLT was a theoretical possibility but wasn't actually realised in any production XSLT processors. Today multithreading is available "out-of-the-box" in Saxon-EE. A paper describing how this works was published at XML Prague 2015: see http://www.saxonica.com/papers/xmlprague-2015mhk.pdf

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow