Question

I am a Java programmer and am new to Clojure. In various places I have seen the terms "sequence" and "collection" used in different contexts. However, I have no idea what the exact difference between them is.

Some examples:

1) In Clojure's documentation for Sequence:

The Seq interface
(first coll)
  Returns the first item in the collection. 
  Calls seq on its argument. If coll is nil, returns nil.
(rest coll)
  Returns a sequence of the items after the first. Calls seq on its argument. 
  If there are no more items, returns a logical sequence for which seq returns nil.
(cons item seq)
  Returns a new seq where item is the first element and seq is the rest.

As you can see, when describing the Seq interface, the first two functions (first/rest) use coll, which seems to indicate a collection, while the cons function uses seq, which seems to indicate a sequence.

2) There are functions called coll? and seq? that can be used to test whether a value is a collection or a sequence. Clearly, collections and sequences are different.

3) In Clojure's documentation about 'Collections', it is said:

Because collections support the seq function, all of the sequence functions can be used with any collection

Does this mean all collections are sequences?

(coll? [1 2 3]) ; => true 
(seq? [1 2 3]) ; => false

The code above tells me this is not the case, because [1 2 3] is a collection but not a sequence.

I think this is a pretty basic question for Clojure, but I have not been able to find a place that clearly explains what the difference is and which one I should use in which case. Any comment is appreciated.


Solution 3

Every sequence is a collection, but not every collection is a sequence.

The seq function makes it possible to convert a collection into a sequence. E.g. for a map you get a list of its entries. That list of entries is different from the map itself, though.
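A minimal REPL-style sketch of those two statements, using only clojure.core predicates. The names v and m are just illustrative bindings:

```clojure
(def v [1 2 3])   ; illustrative vector
(def m {:a 1})    ; illustrative map

(coll? v)         ; true, a vector is a collection
(seq? v)          ; false, but it is not a sequence
(seq? (seq v))    ; true, seq gives you a sequence over it
(coll? (seq v))   ; true, and that sequence is itself a collection

;; The seq of a map is a sequence of its entries, not the map itself.
(seq m)           ; ([:a 1])
(map? (seq m))    ; false
```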

OTHER TIPS

Any object supporting the core first and rest functions is a sequence.

Many objects satisfy this interface and every Clojure collection provides at least one kind of seq object for walking through its contents using the seq function.

So:

user> (seq [1 2 3])
    (1 2 3)

And you can create a sequence object from a map too

user> (seq {:a 1 :b 2})
    ([:a 1] [:b 2])

That's why you can use filter, map, for, etc. on maps, sets, and so on.

So you can treat many collection-like objects as sequences.
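For instance, here is a small sketch of filtering a map and mapping over a set. The scores map is made-up data for illustration:

```clojure
(def scores {:alice 3 :bob 8})   ; hypothetical data

;; filter sees the map as a seq of [key value] entries
(def high (filter (fn [[_ score]] (> score 5)) scores))

high              ; ([:bob 8])
(into {} high)    ; {:bob 8}, pours the entries back into a map

(map inc #{41})   ; (42)
```

Note that the result of filter is a sequence, not a map, which is why into is needed if you want a map back.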

That's also why many sequence handling functions such as filter call seq on the input:

 (defn filter
  "Returns a lazy sequence of the items in coll for which
  (pred item) returns true. pred must be free of side-effects."
  {:added "1.0"
   :static true}
  ([pred coll]
   (lazy-seq
      (when-let [s (seq coll)]

If you call (filter pred 5), the seq call fails once the result is realized:

  Don't know how to create ISeq from: java.lang.Long
                  RT.java:505 clojure.lang.RT.seqFrom
                  RT.java:486 clojure.lang.RT.seq
                 core.clj:133 clojure.core/seq
                core.clj:2523 clojure.core/filter[fn]

You can see that the seq call doubles as the "can this object be treated as a sequence?" check.
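If you want to probe that check without the stack trace, something along these lines should work. try-seq is a hypothetical helper written for this answer, not part of clojure.core, and seqable? assumes Clojure 1.9 or later:

```clojure
;; Hypothetical helper: returns (seq x), or :not-seqable when seq would throw.
(defn try-seq [x]
  (try
    (seq x)
    (catch IllegalArgumentException _
      :not-seqable)))

(try-seq [1 2 3])   ; (1 2 3)
(try-seq "ab")      ; (\a \b)
(try-seq nil)       ; nil
(try-seq 5)         ; :not-seqable

;; Clojure 1.9+ ships a predicate that answers the same question.
(seqable? "ab")     ; true
(seqable? 5)        ; false
```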

Most of this stuff is in Joy of Clojure chapter 5 if you want to go deeper.

Here are a few points that will help you understand the difference between a collection and a sequence.

  1. "Collection" and "Sequence" are abstractions, not a property that can be determined from a given value.

  2. Collections are bags of values.

  3. Sequence is a data structure (subset of collection) that is expected to be accessed in a sequential (linear) manner.

The figure below best describes the relation between them:

[Figure: diagram of the relation between collections and sequences]

You can read more about it here.

In "Clojure for the Brave and True", the author sums it up in a really understandable way:

The collection abstraction is closely related to the sequence abstraction. All of Clojure's core data structures — vectors, maps, lists and sets — take part in both abstractions.

The abstractions differ in that the sequence abstraction is "about" operating on members individually while the collection abstraction is "about" the data structure as a whole. For example, the collection functions count, empty?, and every? aren't about any individual element; they're about the whole.
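One way to see the two views side by side is to compare conj (which respects the concrete collection) with cons (which always builds a sequence). This is only a sketch, and v and l are illustrative names:

```clojure
(def v [1 2 3])    ; illustrative vector
(def l '(1 2 3))   ; illustrative list

;; Collection view: questions and operations about the whole structure
(count v)          ; 3
(empty? v)         ; false
(conj v 0)         ; [1 2 3 0], a vector grows at the end
(conj l 0)         ; (0 1 2 3), a list grows at the front

;; Sequence view: one member at a time, whatever the structure is
(first v)          ; 1
(rest v)           ; (2 3)
(cons 0 v)         ; (0 1 2 3)
(vector? (cons 0 v)) ; false, the result is a seq, not a vector
```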

I have just been through Chapter 5 - "Collection Types" of "The Joy of Clojure", which is a bit confusing (i.e. the next version of that book needs a review). In Chapter 5, on page 86, there is a table which I am not fully happy with:

Table 5.1 from the Joy of Clojure, 2nd ed.

So here's my take (fully updated after coming back to this after a month of reflection).

collection

It's a "thing", a collection of other things.

This is based on the function coll?.

  • The function coll? can be used to test for this.
  • Conversely, anything for which coll? returns true is a collection.

The coll? docstring says:

Returns true if x implements IPersistentCollection

Things that are collections are grouped into three separate classes. Things in different classes are never equal.

  • Maps. Test using (map? foo)
    • Map (two actual implementations with slightly differing behaviours)
    • Sorted map. Note: (sequential? (sorted-map :a 1)) ;=> false
  • Sets. Test using (set? foo)
    • Set
    • Sorted set. Note: (sequential? (sorted-set :a :b)) ;=> false
  • Sequential collections. Test using (sequential? foo)
    • List
    • Vector
    • Queue
    • Seq: (sequential? (seq [1 2 3])) ;=> true
    • Lazy-Seq: (sequential? (lazy-seq (seq [1 2 3]))) ;=> true
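A quick way to convince yourself of the "never equal across classes" rule is to try = on a few pairs. This is just a REPL sketch:

```clojure
;; Same class (sequential): equal when the elements match in order
(= [1 2 3] '(1 2 3))            ; true
(= [1 2 3] (seq [1 2 3]))       ; true

;; Same class (sets, maps): sorted and unsorted variants compare equal
(= #{1 2} (sorted-set 1 2))     ; true
(= {:a 1} (sorted-map :a 1))    ; true

;; Different classes: never equal, even with the "same" contents
(= [1 2 3] #{1 2 3})            ; false
(= [] {})                       ; false
(= {} #{})                      ; false
```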

The Java interop stuff is outside of this:

  • (coll? (to-array [1 2 3])) ;=> false
  • (map? (doto (new java.util.HashMap) (.put "a" 1) (.put "b" 2))) ;=> false

sequential collection (a "chain")

It's a "thing", a collection holding other things according to a specific, stable ordering.

This is based on the function sequential?.

  • The function sequential? can be used to test for this.
  • Conversely, anything for which sequential? returns true is a sequential collection.

The sequential? docstring says:

Returns true if coll implements Sequential

Note: "sequential" is an adjective! In "The Joy of Clojure", the adjective is used as a noun and this is really, really, really confusing:

"Clojure classifies each collection data type into one of three logical categories or partitions: sequentials, maps, and sets."

Instead of "sequential" one should use a "sequential thing" or a "sequential collection" (as used above). On the other hand, in mathematics the following words already exist: "chain", "totally ordered set", "simply ordered set", "linearly ordered set". "chain" sounds excellent but no-one uses that word. Shame!

"Joy of Clojure" also has this to say:

Beware type-based predicates!

Clojure includes a few predicates with names like the words just defined. Although they’re not frequently used, it seems worth mentioning that they may not mean exactly what the definitions here might suggest. For example, every object for which sequential? returns true is a sequential collection, but it returns false for some that are also sequential [better: "that can be considered sequential collections"]. This is because of implementation details that may be improved in a future version of Clojure [and maybe this has already been done?]

sequence (also "sequence abstraction")

This is more a concept than a thing: a series of values (thus ordered) which may or may not exist yet (i.e. a stream). If you say that a thing is a sequence, is that thing also necessarily a Clojure collection, even a sequential collection? I suppose so.

That sequential collection may have been completely computed and be completely available. Or it may be a "machine" that generates values on demand (by computation, likely in a "pure" fashion, or by querying external "impure", "oracular" sources such as a keyboard or a database).

seq

This is a thing: something that can be processed directly by the functions first, rest, next, cons (and possibly others?), i.e. something that implements clojure.lang.ISeq. Note that ISeq is a plain Java interface rather than a Clojure protocol, so "being a seq" simply means being an instance of a class that implements that interface.

This is based on the function seq?.

  • The function seq? can be used to test for this
  • Conversely, a seq is anything for which seq? returns true.

Docstring for seq?:

Return true if x implements ISeq

Docstring for first:

Returns the first item in the collection. Calls seq on its argument. If coll is nil, returns nil.

Docstring for rest:

Returns a possibly empty seq of the items after the first. Calls seq on its argument.

Docstring for next:

Returns a seq of the items after the first. Calls seq on its argument. If there are no more items, returns nil.

You call next on the seq to generate the next element and a new seq. Repeat until nil is obtained.
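That loop can be sketched as follows. walk-with-next is a hypothetical name used only for this illustration:

```clojure
;; Hypothetical helper: visits every element by repeatedly calling next.
(defn walk-with-next [coll]
  (loop [s (seq coll)      ; nil if coll is empty or nil
         visited []]
    (if (nil? s)
      visited
      (recur (next s) (conj visited (first s))))))

(walk-with-next '(:a :b :c))   ; [:a :b :c]
(walk-with-next [])            ; []
(walk-with-next {:k 1})        ; [[:k 1]]
```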

Joy of Clojure calls this a "simple API for navigating collections" and says "a seq is any object that implements the seq API", which is correct if "the API" is the ensemble of the "thing" (of a certain type) and the functions which work on that thing. It depends on a suitable shift in the concept of API.

A note on the special case of the empty seq:

(def empty-seq (rest (seq [:x])))

(type empty-seq)                  ;=> clojure.lang.PersistentList$EmptyList

(nil? empty-seq)                  ;=> false ... empty seq is not nil
(some? empty-seq)                 ;=> true ("true if x is not nil, false otherwise.")

(first empty-seq)                 ;=> nil   ... first of empty seq is nil ("does not exist"); beware confusing this with a nil in a nonempty list!
(next empty-seq)                  ;=> nil   ... "next" of empty seq is nil
(rest empty-seq)                  ;=> ()    ... "rest" of empty seq is the empty seq
   (type (rest empty-seq))        ;=> clojure.lang.PersistentList$EmptyList
   (seq? (rest empty-seq))        ;=> true
   (= (rest empty-seq) empty-seq) ;=> true

(count empty-seq)                 ;=> 0
(empty? empty-seq)                ;=> true

Addenda

The function seq

If you apply the function seq to a thing for which that makes sense (generally a sequential collection), you get a seq representing/generating the members of that collection.

The docstring says:

Returns a seq on the collection. If the collection is empty, returns nil. (seq nil) returns nil. seq also works on Strings, native Java arrays (of reference types) and any objects that implement Iterable. Note that seqs cache values, thus seq should not be used on any Iterable whose iterator repeatedly returns the same mutable object.
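A few calls that exercise the cases named in that docstring. describe is a hypothetical helper showing the common (if (seq coll) ...) idiom:

```clojure
(seq [])                             ; nil, empty collection
(seq nil)                            ; nil
(seq "abc")                          ; (\a \b \c), strings work
(seq (to-array [1 2]))               ; (1 2), Java arrays work
(seq (java.util.ArrayList. [1 2]))   ; (1 2), any Iterable works

;; Because seq returns nil for "nothing there", it doubles as a non-empty test.
(defn describe [coll]                ; hypothetical helper
  (if (seq coll) :has-items :empty))

(describe [1])                       ; :has-items
(describe "")                        ; :empty
```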

After applying seq, you may get objects of various actual classes:

  • clojure.lang.Cons - try (class (seq (map #(* % 2) '( 1 2 3))))
  • clojure.lang.PersistentList
  • clojure.lang.APersistentMap$KeySeq
  • clojure.lang.PersistentList$EmptyList
  • clojure.lang.PersistentHashMap$NodeSeq
  • clojure.lang.PersistentQueue$Seq
  • clojure.lang.PersistentVector$ChunkedSeq

If you apply seq to a sequence, the actual class of the thing returned may be different from the actual class of the thing passed in. It will still be a sequence.

What the "elements" in the sequence are depends on the collection. For example, for maps, they are key-value pairs which look like 2-element vectors (but their actual class is clojure.lang.MapEntry, not the usual vector class).
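To poke at this yourself, something like the following should do. entry is an illustrative name:

```clojure
(class (seq [1 2 3]))     ; clojure.lang.PersistentVector$ChunkedSeq
(class (seq '(1 2 3)))    ; clojure.lang.PersistentList

(def entry (first (seq {:a 1})))   ; one element of a map's seq

(key entry)               ; :a
(val entry)               ; 1
(= entry [:a 1])          ; true, it compares equal to a 2-element vector
(let [[k v] entry]        ; and destructures like one
  [k v])                  ; [:a 1]
```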

The function lazy-seq

Creates a thing that generates more things lazily (a suspended machine, a suspended stream, a thunk).

The docstring says:

Takes a body of expressions that returns an ISeq or nil, and yields a Seqable object that will invoke the body only the first time seq is called, and will cache the result and return it on all subsequent seq calls. See also - realized?
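The "only the first time" and "cache the result" parts can be observed with a counter. This is a sketch. The calls atom and the name lazy-nums are made up for the demonstration:

```clojure
(def calls (atom 0))          ; counts how often the body runs

(def lazy-nums
  (lazy-seq
    (swap! calls inc)         ; side effect, only to make the evaluation visible
    (seq [1 2 3])))

@calls                        ; 0, nothing has run yet
(realized? lazy-nums)         ; false

(first lazy-nums)             ; 1, forces the body
(first lazy-nums)             ; 1, served from the cache

@calls                        ; 1, the body ran exactly once
(realized? lazy-nums)         ; true
```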

A note on "functions" and "things" ... and "objects"

In the Clojure Universe, I like to talk about "functions" and "things", but not about "objects", which is a term heavily laden with Java-ness and other badness. Mention of objects feels like shards poking up from the underlying Java universe.

What is the difference between function and thing?

It's fluid! Some stuff is pure function, some stuff is pure thing, some is in between (can be used as a function and has attributes of a thing).

In particular, Clojure allows contexts where one considers keywords (things) as functions (to look up values in maps) or where one interprets maps (things) as functions, or shorthand for functions (which take a key and return the value associated with that key in the map).
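A short sketch of both directions, with m as an illustrative map:

```clojure
(def m {:a 1 :b 2})        ; illustrative map

(:a m)                     ; 1, keyword used as a function of the map
(m :b)                     ; 2, map used as a function of the key
(:c m :none)               ; :none, optional default when the key is missing

;; Both can be passed wherever a function is expected.
(map :a [{:a 1} {:a 2}])   ; (1 2)
(map m [:a :b])            ; (1 2)

(ifn? :a)                  ; true, callable
(ifn? m)                   ; true, callable
(fn? :a)                   ; false, but not a function object created with fn
```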

Evidently, functions are things as they are "first-class citizens".

It's also contextual! In some contexts, a function becomes a thing, or a thing becomes a function.

There are nasty mentions of objects ... these are shards poking up from the underlying Java universe.

For presentation purposes, a diagram of Collections

Collections in Clojure

For seq?:

Return true if x implements ISeq

For coll?:

Returns true if x implements IPersistentCollection

I also found in the Clojure source code that the ISeq interface extends IPersistentCollection, so, as Rörd said, every sequence is a collection.
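You can check that relationship from the REPL without opening the Java source. A small sketch using isa? on the two interfaces:

```clojure
;; ISeq is a sub-interface of IPersistentCollection, but not the other way round.
(isa? clojure.lang.ISeq clojure.lang.IPersistentCollection)   ; true
(isa? clojure.lang.IPersistentCollection clojure.lang.ISeq)   ; false

;; So anything seq? accepts is also accepted by coll?.
(every? coll? [(seq [1]) (seq {:a 1}) (seq "ab") (range 3)])  ; true
(every? seq?  [(seq [1]) (seq {:a 1}) (seq "ab") (range 3)])  ; true
```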

Licensed under: CC-BY-SA with attribution