Actually, only chunked seqs are realized in batches of (normally [1]) 32 elements. Non-chunked seqs are realized one at a time. Functions like map and filter preserve the chunked / unchunked "mode" of their seq arguments.
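A quick way to see the difference is to count how often the mapped function actually runs. This is only a sketch: calls and counting-inc are names made up for the illustration, and it assumes a vector with at least 32 elements so that its first chunk is full.

```clojure
;; Hypothetical helpers: count how many times the mapped fn is invoked.
(def calls (atom 0))

(defn counting-inc [x]
  (swap! calls inc)
  (inc x))

;; A seq over a vector is chunked, so asking for a single element
;; runs the function over the whole first chunk.
(reset! calls 0)
(first (map counting-inc (vec (range 100))))
(def chunked-calls @calls)

;; A seq over a list is not chunked, so only one element is realized.
(reset! calls 0)
(first (map counting-inc (apply list (range 100))))
(def unchunked-calls @calls)

(println "chunked:" chunked-calls "unchunked:" unchunked-calls)
;; prints: chunked: 32 unchunked: 1
```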
You may thus be able to use the regular Clojure sequence functions without giving up any laziness, provided you make sure that you're passing a non-chunked seq to them. There are two possible approaches here, the second of which is probably more applicable to your case:
Produce your seq without regard to whether it's going to be chunked or not; then, if it happens to be chunked, wrap it in an "unchunking seq":
    (defn unchunk [xs]
      (lazy-seq
        (if-let [xs (seq xs)]
          (cons (first xs) (unchunk (rest xs))))))

    user=> (->> (range 40) (unchunk) (map #(println "THIS IS" %)) first)
    THIS IS 0
    user=> (->> (range 40) (map #(println "THIS IS" %)) first)
    THIS IS 0
    THIS IS 1
    THIS IS 2
    ...
To use this approach with the example in the question text, you'd have to unchunk the seq over the vector [1 2 3 4 5].
Produce your initial seq (the innermost one in your transformation pipeline) in some way which does not happen to chunk the output. This may involve writing your own producers explicitly:
    (defn my-seq-producer [& args]
      (lazy-seq
        (if ...
          (cons (foo) (my-seq-producer ...)))))
The key thing to note here is that you're wrapping a cons call in a conditional inside lazy-seq. If the test in the conditional is not satisfied, the conditional will produce nil and the lazy seq will turn out to be empty upon being realized; otherwise (foo) will be produced as the first element of the output, followed by some "rest" part of the sequence, without any chunking.
In particular, if you write your own producer of a lazy seq of items fetched over HTTP, you will be able to transform it using the core sequence functions while preserving full laziness.
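As a rough sketch of such a producer: fetch-item below is a hypothetical stand-in for a real HTTP call (it only records which items were requested), and item-seq and its limit argument are likewise made up for the example.

```clojure
;; Hypothetical stand-in for an expensive call such as an HTTP request.
;; It records every "fetch" so we can see how many were made.
(def fetched (atom []))

(defn fetch-item [n]
  (swap! fetched conj n)
  {:id n})

;; Unchunked producer: a cons inside a conditional inside lazy-seq.
(defn item-seq [n limit]
  (lazy-seq
    (when (< n limit)
      (cons (fetch-item n) (item-seq (inc n) limit)))))

;; Core sequence functions keep the seq unchunked, so taking the
;; first matching element triggers exactly one fetch.
(def first-id
  (->> (item-seq 0 40)
       (map :id)
       (filter even?)
       first))

(println first-id @fetched)
;; prints: 0 [0]
```

Only item 0 is fetched because neither map nor filter finds a chunked seq to process in bulk.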
The simplest way to tell which seq is chunked and which is not is to use the chunked-seq? function, although there are two caveats:
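You should probably use chunked-seq? on the result of calling seq on whichever seq you're interested in, rather than on the original seq itself. This is because your seq might be a chunked-seq-producing thunk wrapped in a LazySeq object. In fact, this is the case with range in Clojure versions where range returns a LazySeq. (Note that in Clojure 1.7 and later, a finite range is itself a chunked seq, so the first call below also returns true there.)

    (chunked-seq? (range 40))       ;= false
    (chunked-seq? (seq (range 40))) ;= true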
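The same check can be sketched with map over a vector, which returns a LazySeq wrapping a chunked seq. The var names here are made up for the illustration.

```clojure
(def v (vec (range 40)))

;; map returns a LazySeq; the chunked seq only appears once it is realized.
(def mapped (map inc v))

(def direct-check (chunked-seq? mapped))        ; false: it is a LazySeq
(def seq-check    (chunked-seq? (seq mapped)))  ; true: chunked underneath

;; For comparison, mapping over a list yields an unchunked seq.
(def list-check (chunked-seq? (seq (map inc (list 1 2 3))))) ; false

(println direct-check seq-check list-check)
;; prints: false true false
```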
A seq may be partially chunked; for example, you might cons something onto the front of a chunked seq, thereby producing a seq which is not chunked, but which nevertheless has a chunked "rest". Explicit unchunking deals with this happily, since it doesn't really check whether the underlying seq is chunked or not.
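A small sketch of the partially chunked case, reusing the unchunk function from above. The names mixed, calls and counting-inc are invented for the example.

```clojure
(defn unchunk [xs]
  (lazy-seq
    (if-let [xs (seq xs)]
      (cons (first xs) (unchunk (rest xs))))))

;; Hypothetical helpers: count how many times the mapped fn is invoked.
(def calls (atom 0))

(defn counting-inc [x]
  (swap! calls inc)
  (inc x))

;; One element consed onto the front of a chunked seq over a vector.
(def mixed (cons 0 (seq (vec (range 1 41)))))

(def head-chunked? (chunked-seq? mixed))        ; false
(def rest-chunked? (chunked-seq? (rest mixed))) ; true

;; Asking for the second element reaches the chunked part:
;; one call for the head plus a full chunk of 32.
(reset! calls 0)
(second (map counting-inc mixed))
(def mixed-calls @calls)      ; 33

;; With explicit unchunking only the two requested elements are touched.
(reset! calls 0)
(second (map counting-inc (unchunk mixed)))
(def unchunked-calls @calls)  ; 2
```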
[1] Consider a seq over a vector whose tail is fewer than 32 elements long.