
As far as I know, recursion is very elegant but inefficient in OOP and procedural programming (see the wonderful "Higher-Order Perl" by Mark Jason Dominus). I have heard that in functional programming recursion is fast while keeping its elegance and simplicity. Could someone confirm and possibly expand on this? I am thinking in terms of XSLT and Haskell (high on my next-language-to-learn list).



Tail recursion is iteration in any decent functional language implementation. Here's an example using GHC Haskell: a simple program to add a sequence of numbers. It begins as the composition of several recursive functions:

import qualified Data.Vector as U

main = print (U.sum (U.enumFromTo 1 (10000000 :: Int)))

The compiler optimizes this into a single tail-recursive function (in a source-to-source transformation):

loop x y = case y <= 10000000 of
      False -> x
      True  -> loop (x + y) (y + 1)

This recursive function is then compiled into a straightforward loop:

    loop:
            cmpq $10000000,%rsi
            jle .Lc219
            movq %r14,%rbx
            movq (%rbp),%rax
            jmp *(%rax)
    .Lc219:
            addq %rsi,%r14
            incq %rsi
            jmp loop

Or with the GHC LLVM backend, additional optimizations are applied to the imperative representation of the program:

    loop:
        leaq    1(%rsi), %rax
        addq    %rsi, %r14
        cmpq    $10000001, %rax
        jge     .LBB1_5
        addq    $2, %rsi
        addq    %rax, %r14
    test:                                # %tailrecurse
        cmpq    $10000001, %rsi
        jl      loop

Note how the tail-recursive label is tagged (%tailrecurse).

So we had a pipeline of recursive functions, which was compiled to a single tail-recursive function, which was in turn compiled to a single imperative loop that uses no stack and comes to just 8 instructions in the end.

And that is why both function composition and recursion are extremely efficient in good optimizing functional languages.
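As a rough check of the claim above, the tail-recursive loop can be written by hand and compared with a pipeline of library functions. This is only a sketch under a few assumptions: `sumTo` and `sumPipeline` are hypothetical names, Prelude lists stand in for Data.Vector so no extra package is needed, and the bang patterns are my addition to keep the accumulator strict even without optimization flags.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Hand-written form of the loop shown earlier: x is the running total,
-- y is the counter, and the recursive call is in tail position.
sumTo :: Int -> Int
sumTo n = loop 0 1
  where
    loop :: Int -> Int -> Int
    loop !x !y = case y <= n of
      False -> x
      True  -> loop (x + y) (y + 1)

-- The same sum expressed as a pipeline of library functions
-- (Prelude lists here, an assumption made to avoid extra packages).
sumPipeline :: Int -> Int
sumPipeline n = sum (enumFromTo 1 n)

main :: IO ()
main = do
  print (sumTo 1000000)        -- 500000500000
  print (sumPipeline 1000000)  -- 500000500000
```

Both definitions give the same result; whether the pipeline is actually fused into the loop depends on the library and the optimization level in use.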

OOP/Procedural languages tend to place data on the stack each time a recursive call is made - thus recursion is not as efficient as iteration in these languages.

By contrast, compilers/interpreters for functional languages are typically designed to optimize tail recursion to be as efficient as iteration:

Recursion may require maintaining a stack, but tail recursion can be recognized and optimized by a compiler into the same code used to implement iteration in imperative languages. The Scheme programming language standard requires implementations to recognize and optimize tail recursion. Tail recursion optimization can be implemented by transforming the program into continuation passing style during compilation, among other approaches.
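To make the continuation-passing idea a little more concrete, here is a small hand-written sketch in Haskell. It is not what any particular compiler emits, and `sumDirect` and `sumCPS` are hypothetical names. In the CPS version every recursive call is a tail call; the pending work is carried in the continuation closure rather than on the call stack.

```haskell
-- Direct style: the addition happens after the recursive call returns,
-- so the recursive call is not in tail position.
sumDirect :: [Int] -> Int
sumDirect []     = 0
sumDirect (x:xs) = x + sumDirect xs

-- Continuation-passing style: k describes what to do with the sum of
-- the rest of the list, so the recursive call is the last thing done.
sumCPS :: [Int] -> (Int -> r) -> r
sumCPS []     k = k 0
sumCPS (x:xs) k = sumCPS xs (\s -> k (x + s))

main :: IO ()
main = do
  print (sumDirect [1 .. 100])  -- 5050
  print (sumCPS [1 .. 100] id)  -- 5050
```

Note that CPS on its own only moves the pending additions from the stack into closures; it is the combination with tail-call elimination that lets the calls run as jumps.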

The questions "What is tail call optimization?" and "Which languages support tail recursion optimization?" have more detailed information.

If the compiler in use supports the tail call optimization and you structure your code to take advantage of it, recursion isn't inefficient.

Due to the prevalence of recursion in functional programming, compilers for functional languages are more likely to implement the tail call optimization than procedural ones.
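What "structuring your code to take advantage of it" can look like is sketched below in Haskell, using factorial as an illustrative example. The names `factNaive` and `factAcc` are hypothetical, and both functions assume a non-negative argument. The usual trick is to pass the partial result along as an accumulator so that the recursive call becomes the last thing the function does.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Not tail recursive: the multiplication has to wait for the
-- recursive call to return. Assumes n >= 0.
factNaive :: Integer -> Integer
factNaive 0 = 1
factNaive n = n * factNaive (n - 1)

-- Tail recursive: the running product travels in an accumulator,
-- so nothing is left to do after the recursive call. Assumes n >= 0.
factAcc :: Integer -> Integer
factAcc = go 1
  where
    go :: Integer -> Integer -> Integer
    go !acc 0 = acc
    go !acc n = go (acc * n) (n - 1)

main :: IO ()
main = do
  print (factNaive 10)  -- 3628800
  print (factAcc 10)    -- 3628800
```

The bang pattern on the accumulator is a Haskell-specific detail: without it, laziness could build up a chain of unevaluated multiplications instead of a running product.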

Efficient recursion in XSLT

There are two main ways to achieve efficient recursion in XSLT:

  1. Tail-recursion optimization
  2. Divide and Conquer (DVC)

There are a lot of answers covering tail recursion, so here's just a simple example:

  <xsl:function name="my:sum">
   <xsl:param name="pAccum" as="xs:double*"/>
   <xsl:param name="pNums" as="xs:double*"/>

   <xsl:sequence select=
     "if(empty($pNums))
        then $pAccum
        else
           my:sum($pAccum + $pNums[1], $pNums[position() > 1])
     "
   />
  </xsl:function>

One can check that my:sum(0, 1 to 100) is evaluated to: 5050.

Here is how one would implement the sum() function in a DVC way:

  <xsl:function name="my:sum2">
      <xsl:param name="pNums" as="xs:double*"/>

      <xsl:sequence select=
        "if(empty($pNums))
          then 0
          else
            if(count($pNums) eq 1)
              then $pNums[1]
              else
                for $half in count($pNums) idiv 2
                  return
                    my:sum2($pNums[not(position() gt $half)])
                    +
                    my:sum2($pNums[position() gt $half])
        "
      />
  </xsl:function>


The main idea behind DVC is to subdivide the input sequence into two (usually) or more parts and to process them independently from one another, then to combine the results in order to produce the result for the total input sequence.

Note that for a sequence of N items, the maximum depth of the call stack at any point in time is roughly log2(N), which is shallow enough for most practical purposes. For example, the maximum depth of the call stack when processing a sequence of 1000000 (1M) items would be only about 20.
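The depth estimate can be checked with a short sketch. It is written in Haskell rather than XSLT, and `dvcSum` and `dvcDepth` are hypothetical names. `dvcSum` mirrors the halving scheme of my:sum2, and `dvcDepth` counts every nested call on the deepest path, including the outermost call and the final single-item call.

```haskell
-- Divide-and-conquer sum over a list, mirroring my:sum2:
-- split the input in two halves, sum each half, add the results.
dvcSum :: [Double] -> Double
dvcSum []  = 0
dvcSum [x] = x
dvcSum xs  = dvcSum left + dvcSum right
  where
    half          = length xs `div` 2
    (left, right) = splitAt half xs

-- Number of nested calls on the deepest path for n items:
-- each level recurses on at most ceiling (n / 2) items.
dvcDepth :: Int -> Int
dvcDepth n
  | n <= 1    = 1
  | otherwise = 1 + dvcDepth (n - n `div` 2)

main :: IO ()
main = do
  print (dvcSum [1 .. 100])  -- 5050.0
  print (dvcDepth 1000000)   -- 21
```

So for a million items the deepest chain is 21 calls by this way of counting, in line with a depth of about log2(N), whereas a non-optimized linear recursion would nest a million calls.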

While there are some XSLT processors that are not smart enough to recognize and optimize tail-recursion, a DVC-recursive template works on any XSLT processor.

