recursion in implementation of “partition” function

https://stackoverflow.com//questions/21068555

02-01-2020
|

Question

I was randomly reading through Clojure source code and I saw that partition function was defined in terms of recursion without using "recur":

(defn partition
  ... ...
  ([n step coll]
     (lazy-seq
       (when-let [s (seq coll)]
         (let [p (doall (take n s))]
           (when (= n (count p))
             (cons p (partition n step (nthrest s step))))))))
  ... ...)

Is there any reason of doing this?

Solution

Partition is lazy. The recursive call to partition occurs within the body of a lazy-seq. Therefore, it is not immediately invoked but returned in a special seq-able object to be evaluated when needed and cache the results realized thus far. The stack depth is limited to one invocation at a time.

A recur without a lazy-seq could be used to create an eager version, but you would not want to use it on sequences of indeterminate length as you can with the version in core.

OTHER TIPS

To build on @A.Webb's answer and @amalloy's comment:

recur is not a shorthand to call the function and it's not a function. It's a special form (what in another language you would call a syntax) to perform tail call optimization.

Tail call optimization is a technique that allows to use recursion without blowing up the stack (without it, every recursive call adds its call frame to the stack). It is not implemented natively in Java (yet), which is why recur is used to mark a tail call in Clojure.

Recursion using lazy-seq is different because it delays the recursive call by wrapping it in a closure. What it means is that a call to a function implemented in terms of lazy-seq (and in particular every recursive call in such a function) returns (immediately) a LazySeq sequence, whose computation is delayed until it is iterated through.

To illustrate and qualify @amalloy's comment that recur and laziness are mutually exclusive, here's an implementation of filter that uses both techniques:

(defn filter [pred coll]
  (letfn [(step [pred coll]
            (when-let [[x & more] (seq coll)]
              (if (pred x)
                (cons x (lazy-seq (step pred more))) ;; lazy recursive call
                (recur pred more))))]                ;; tail call
    (lazy-seq (step pred coll))))

(filter even? (range 10))
;; => (0 2 4 6 8)

Both techniques can be used in the same function, but not for the same recursive call ; the function wouldn't compile if the lazy recursive call to step used recur because in this case recur wouldn't be in tail call position (the result of the tail call would not be returned directly but would be passed to lazy-seq).

All of the lazy functions are written this way. This implementation of partition would blow the stack without the call to lazy-seq for a large enough sequence.

Read a bit about TCO (tail call optimization) if you are more interested in how recur works. When you are using tail recursion it means you can jump out of your current function call without losing anything. In the case of this implementation you wouldn't be able to do that because you are cons-ing p on the next call of partition. Being in tail position means you are the last thing being called. In the implementation cons is in tail position. recur only works on tail position to guarantee TCO.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow