recursion in implementation of “partition” function
-
02-01-2020 - |
Question
I was randomly reading through Clojure source code and I saw that partition function was defined in terms of recursion without using "recur":
(defn partition
... ...
([n step coll]
(lazy-seq
(when-let [s (seq coll)]
(let [p (doall (take n s))]
(when (= n (count p))
(cons p (partition n step (nthrest s step))))))))
... ...)
Is there any reason of doing this?
Solution
Partition is lazy. The recursive call to partition
occurs within the body of a lazy-seq
. Therefore, it is not immediately invoked but returned in a special seq-able object to be evaluated when needed and cache the results realized thus far. The stack depth is limited to one invocation at a time.
A recur without a lazy-seq could be used to create an eager version, but you would not want to use it on sequences of indeterminate length as you can with the version in core.
OTHER TIPS
To build on @A.Webb's answer and @amalloy's comment:
recur
is not a shorthand to call the function and it's not a function. It's a special form (what in another language you would call a syntax) to perform tail call optimization.
Tail call optimization is a technique that allows to use recursion without blowing up the stack (without it, every recursive call adds its call frame to the stack). It is not implemented natively in Java (yet), which is why recur
is used to mark a tail call in Clojure.
Recursion using lazy-seq
is different because it delays the recursive call by wrapping it in a closure. What it means is that a call to a function implemented in terms of lazy-seq
(and in particular every recursive call in such a function) returns (immediately) a LazySeq
sequence, whose computation is delayed until it is iterated through.
To illustrate and qualify @amalloy's comment that recur
and laziness are mutually exclusive, here's an implementation of filter
that uses both techniques:
(defn filter [pred coll]
(letfn [(step [pred coll]
(when-let [[x & more] (seq coll)]
(if (pred x)
(cons x (lazy-seq (step pred more))) ;; lazy recursive call
(recur pred more))))] ;; tail call
(lazy-seq (step pred coll))))
(filter even? (range 10))
;; => (0 2 4 6 8)
Both techniques can be used in the same function, but not for the same recursive call ; the function wouldn't compile if the lazy recursive call to step
used recur
because in this case recur
wouldn't be in tail call position (the result of the tail call would not be returned directly but would be passed to lazy-seq
).
All of the lazy functions are written this way. This implementation of partition
would blow the stack without the call to lazy-seq
for a large enough sequence.
Read a bit about TCO (tail call optimization) if you are more interested in how recur
works. When you are using tail recursion it means you can jump out of your current function call without losing anything. In the case of this implementation you wouldn't be able to do that because you are cons
-ing p
on the next call of partition
. Being in tail position means you are the last thing being called. In the implementation cons
is in tail position. recur
only works on tail position to guarantee TCO.