If the only non-stack-consuming looping construct in Clojure is "recur", how does this lazy-seq work?

StackOverflow https://stackoverflow.com/questions/22328304

Question

The ClojureDocs page for lazy-seq gives an example of generating a lazy-seq of all positive numbers:

(defn positive-numbers
  ([] (positive-numbers 1))
  ([n] (cons n (lazy-seq (positive-numbers (inc n))))))

This lazy-seq can be evaluated for pretty large indexes without throwing a StackOverflowError (unlike the sieve example on the same page):

user=> (nth (positive-numbers) 99999999)
100000000

If only recur can be used to avoid consuming stack frames in a recursive function, how can this lazy-seq example seemingly call itself without overflowing the stack?

Solution

A lazy sequence keeps the computation that generates the rest of the sequence in a thunk, which is not called immediately. As each element (or chunk of elements, as the case may be) is requested, the next thunk is called to retrieve the value(s). That thunk may in turn create another thunk to represent the tail of the sequence, if the sequence continues. The magic is that (1) these special thunks implement the sequence interface and can be used transparently as sequences, and (2) each thunk is called only once and its value is cached, so the realized portion is simply a sequence of values.

Here is the general idea without the magic, just good ol' functions:

(defn my-thunk-seq 
  ([] (my-thunk-seq 1)) 
  ([n] (list n #(my-thunk-seq (inc n)))))

(defn my-next [s] ((second s)))

(defn my-realize [s n] 
  (loop [a [], s s, n n] 
    (if (pos? n) 
      (recur (conj a (first s)) (my-next s) (dec n)) 
      a)))

user=> (-> (my-thunk-seq) first)
1
user=> (-> (my-thunk-seq) my-next first)
2
user=> (my-realize (my-thunk-seq) 10)
[1 2 3 4 5 6 7 8 9 10]
user=> (count (my-realize (my-thunk-seq) 100000))
100000 ; Level stack consumption

The magic bits happen inside clojure.lang.LazySeq, which is defined in Java, but we can do the same magic directly in Clojure by implementing the interfaces on a type and using an atom as the cache. (The implementation that follows is for example purposes only.)

(deftype MyLazySeq [thunk-mem]
  clojure.lang.Seqable 
  (seq [_]
    ;; Realize the thunk on first access, then cache the resulting seq.
    (when (fn? @thunk-mem)
      (swap! thunk-mem (fn [f] (seq (f)))))
    @thunk-mem)
  ;; Implementing ISeq is necessary because cons calls seq on any
  ;; argument that is not an ISeq, which would force realization.
  clojure.lang.ISeq
  (first [this] (first (seq this)))
  (next [this] (next (seq this)))
  (more [this] (rest (seq this)))
  (cons [this x] (cons x (seq this))))

(defmacro my-lazy-seq [& body] 
  `(MyLazySeq. (atom (fn [] ~@body))))

This already works with take, etc., but since take itself calls lazy-seq, we'll write a my-take that uses my-lazy-seq instead, to eliminate any confusion.

(defn my-take
  [n coll]
  (my-lazy-seq
   (when (pos? n)
     (when-let [s (seq coll)]
       (cons (first s) (my-take (dec n) (rest s)))))))

Now let's make a slow infinite sequence to test the caching behavior.

(defn slow-inc [n] (Thread/sleep 1000) (inc n))

(defn slow-pos-nums 
  ([] (slow-pos-nums 1)) 
  ([n] (cons n (my-lazy-seq (slow-pos-nums (slow-inc n))))))

And the REPL test:

user=> (def nums (slow-pos-nums))
#'user/nums
user=> (time (doall (my-take 10 nums)))
"Elapsed time: 9000.384616 msecs"
(1 2 3 4 5 6 7 8 9 10)
user=> (time (doall (my-take 10 nums)))
"Elapsed time: 0.043146 msecs"
(1 2 3 4 5 6 7 8 9 10)
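
The same call-once-then-cache behavior can be observed on the built-in lazy-seq. This is only a minimal sketch: the names body-runs and cached-seq are made up for illustration, and a side effect in the body is used purely to make evaluation visible.

```clojure
;; Hypothetical names: body-runs counts how often the lazy-seq body executes.
(def body-runs (atom 0))

(def cached-seq
  (lazy-seq
    (swap! body-runs inc)   ; side effect so evaluation can be observed
    (list 1 2 3)))

(realized? cached-seq)  ; false: the body has not run yet
@body-runs              ; 0
(first cached-seq)      ; forces the thunk and returns 1
(first cached-seq)      ; answered from the cached result
(realized? cached-seq)  ; true
@body-runs              ; 1
```

The body runs on the first call that needs a value and never again, which is why the second timed run in the transcript above is so fast.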

Other tips

Keep in mind that lazy-seq is a macro, and therefore does not evaluate its body when your positive-numbers function is called. In that sense, positive-numbers isn't truly recursive. It returns immediately, and the inner "recursive" call to positive-numbers doesn't happen until the seq is consumed.

user=> (source lazy-seq)
(defmacro lazy-seq
  "Takes a body of expressions that returns an ISeq or nil, and yields
  a Seqable object that will invoke the body only the first time seq
  is called, and will cache the result and return it on all subsequent
  seq calls. See also - realized?"
  {:added "1.0"}
  [& body]
  (list 'new 'clojure.lang.LazySeq (list* '^{:once true} fn* [] body)))
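
To see that the inner call really is deferred, one can log every invocation of the producer. The sketch below is an illustration only; call-log, logged-numbers and logged-seq are hypothetical names, not part of the original example.

```clojure
;; Hypothetical names: call-log records the argument of every real invocation.
(def call-log (atom []))

(defn logged-numbers [n]
  (swap! call-log conj n)
  (cons n (lazy-seq (logged-numbers (inc n)))))

(def logged-seq (logged-numbers 1))
@call-log            ; [1]          only the outer call has run so far
(second logged-seq)  ; 2            forces exactly one more call
@call-log            ; [1 2]
(nth logged-seq 4)   ; 5            realizes elements 3, 4 and 5
@call-log            ; [1 2 3 4 5]  element 2 was cached, not recomputed
```

Each entry in the log corresponds to a call that had already returned before the next one started, so no call is ever nested inside another.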

I think the trick is that the producer function (positive-numbers) is never actually called recursively. It does not accumulate stack frames the way basic Little Schemer-style recursion would, because LazySeq invokes it on demand, once per entry in the sequence, and each invocation returns before the next one starts. Once the closure for an entry has been evaluated, it can be discarded, so the stack stays at a constant depth and closures that are no longer referenced can be garbage-collected as the code churns through the sequence.
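
The contrast with basic recursion can be sketched as follows. Both function names are hypothetical, the limit of 1000000 is an arbitrary choice, and the overflow of the eager version assumes a default-sized JVM thread stack.

```clojure
;; Hypothetical eager counterpart: each call waits for the nested call,
;; so every element keeps a stack frame alive until the recursion bottoms out.
(defn eager-numbers [n limit]
  (when (<= n limit)
    (cons n (eager-numbers (inc n) limit))))

;; Same shape as positive-numbers: the nested call is wrapped in lazy-seq,
;; so it only runs later, after this call has already returned.
(defn lazy-numbers [n]
  (cons n (lazy-seq (lazy-numbers (inc n)))))

;; Expected to overflow on a default-sized stack, so the error is caught.
(def eager-result
  (try
    (count (eager-numbers 1 1000000))
    (catch StackOverflowError _ :stack-overflow)))

;; Walks one million elements at a constant stack depth.
(def lazy-result (nth (lazy-numbers 1) 1000000))
```

Here lazy-result is 1000001, while eager-result should come back as :stack-overflow unless the JVM was started with an unusually large stack.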

License: CC-BY-SA with attribution
Not affiliated with StackOverflow