Cartesian product in clojure

Question 1

This is a lot easier to do as a for-comprehension than by trying to work out the recursion manually:

(defn cart [colls]
  (if (empty? colls)
    '(())
    (for [more (cart (rest colls))
          x (first colls)]
      (cons x more))))

user> (cart '((a b c) (1 2 3) (black white)))
((a 1 black) (a 1 white) (a 2 black) (a 2 white) (a 3 black) (a 3 white) 
 (b 1 black) (b 1 white) (b 2 black) (b 2 white) (b 3 black) (b 3 white) 
 (c 1 black) (c 1 white) (c 2 black) (c 2 white) (c 3 black) (c 3 white))

The base case is obvious (it needs to be a list containing the empty list, not the empty list itself, since there is one way to take a cartesian product of no lists). In the recursive case, you just iterate over each element x of the first collection, and then over each cartesian product of the rest of the lists, prepending the x you've chosen.

Note that it's important to write the two clauses of the for comprehension in this slightly unnatural order: swapping them results in a substantial slowdown. The reason for this is to avoid duplicating work. The body of the second binding will be evaluated once for each item in the first binding, which (if you wrote the clauses in the wrong order) would mean many wasted copies of the expensive recursive clause. If you wish to be extra careful, you can make it clear that the two clauses are independent, by instead writing:

(let [c1 (first colls)]
  (for [more (cart (rest colls))
        x c1]
    (cons x more)))

Question 2

I would check https://github.com/clojure/math.combinatorics it has

(combo/cartesian-product [1 2] [3 4]) ;;=> ((1 3) (1 4) (2 3) (2 4))

Question 3

For the sake of comparison, in the spirit of the original

(defn cart 
  ([xs] 
   xs) 
  ([xs ys] 
   (mapcat (fn [x] (map (fn [y] (list x y)) ys)) xs)) 
  ([xs ys & more] 
   (mapcat (fn [x] (map (fn [z] (cons x z)) (apply cart (cons ys more)))) xs)))

(cart '(a b c) '(d e f) '(g h i))
;=> ((a d g) (a d h) (a d i) (a e g) (a e h) (a e i) (a f g) (a f h) (a f i)
;    (b d g) (b d h) (b d i) (b e g) (b e h) (b e i) (b f g) (b f h) (b f i) 
;    (c d g) (c d h) (c d i) (c e g) (c e h) (c e i) (c f g) (c f h) (c f i))

Question 4

I know I'm late to the party -- I just wanted to add a different approach, for the sake of completeness.

Compared to amalloy's approach, it is lazy too (the parameter lists are eagerly evaluated, though) and slightly faster when all results are required (I tested them both with the demo code below), however it is prone to stack overflow (much like the underlying for comprehension it generates and evaluates) as the number of lists increases. Also, keep in mind that eval has a limit to the size of the code it can be passed to.

Consider first a single instance of the problem: You want to find the cartesian product of [:a :b :c] and '(1 2 3). The obvious solution is to use a for comprehension, like this:

(for [e1 [:a :b :c]
      e2 '(1 2 3)]
  (list e1 e2))

; ((:a 1) (:a 2) (:a 3) (:b 1) (:b 2) (:b 3) (:c 1) (:c 2) (:c 3))

Now, the question is: Is it possible to generalize this in a way that works with an arbitrary number of lists? The answer here is affirmative. This is what the following macro does:

(defmacro cart [& lists]
  (let [syms (for [_ lists] (gensym))]
    `(for [~@(mapcat list syms lists)]
       (list ~@syms))))

(macroexpand-1 '(cart [:a :b :c] '(1 2 3)))

; (clojure.core/for [G__4356 [:a :b :c] 
;                    G__4357 (quote (1 2 3))] 
;   (clojure.core/list G__4356 G__4357))

(cart [:a :b :c] '(1 2 3))

; ((:a 1) (:a 2) (:a 3) (:b 1) (:b 2) (:b 3) (:c 1) (:c 2) (:c 3))

Essentially, you have the compiler generate the appropriate for comprehension for you. Converting this to a function is pretty straightforward, but there is a small catch:

(defn cart [& lists]
  (let [syms (for [_ lists] (gensym))]
    (eval `(for [~@(mapcat #(list %1 `'~%2) syms lists)]
             (list ~@syms)))))

(cart [:a :b :c] '(1 2 3))

; ((:a 1) (:a 2) (:a 3) (:b 1) (:b 2) (:b 3) (:c 1) (:c 2) (:c 3))

Lists that are left unquoted are treated as function calls, which is why quoting %2 is necessary here.

Online Demo:

; https://projecteuler.net/problem=205

(defn cart [& lists]
  (let [syms (for [_ lists] (gensym))]
    (eval `(for [~@(mapcat #(list %1 `'~%2) syms lists)]
             (list ~@syms)))))

(defn project-euler-205 []

  (let [rolls (fn [n d]
                (->> (range 1 (inc d))
                  (repeat n)
                  (apply cart)
                  (map #(apply + %))
                  frequencies))

        peter-rolls (rolls 9 4)
        colin-rolls (rolls 6 6)

        all-results (* (apply + (vals peter-rolls))
                       (apply + (vals colin-rolls)))

        peter-wins (apply + (for [[pk pv] peter-rolls
                                  [ck cv] colin-rolls
                                  :when (> pk ck)]
                              (* pv cv)))]

    (/ peter-wins all-results)))

(println (project-euler-205)) ; 48679795/84934656

Question 5

Personally, I would use amalloy's for solution. My general rule of thumb is that if my loop can be expressed as a single map/filter/etc call with a simple function argument (so a function name or short fn/#() form), its better to use the function. As soon as it gets more complex than that, a for expression is far easier to read. In particular, for is far better than nested maps. That said, if I didn't use for here, this is how I'd write the function:

(defn cart
  ([] '(()))
  ([xs & more]
    (mapcat #(map (partial cons %)
                  (apply cart more))
            xs)))

Things to note: First, there's no need for the reduce. Recursion can handle it just fine.

Second, only two cases. We can call the function just fine on an empty list, so all we care about is empty vs non-empty.

Third, as amalloy explained, the correct value of (cart) is '(()). This is actually rather subtle, and I reliably mess this up when I write a function like this. If you walk through a simple case very carefully, you should be able to see why that value makes the recursion work.

Fourth, I generally don't like to use fn. This is more of a personal preference, but I always use #(), partial, or comp if I can get away with it. #() is definitely idiomatic for smaller functions, though the other two are a bit less common.

Fifth, some style notes. The biggest issue is indentation. The best suggestion here is to find an editor that auto-indents lisp code. Auto-indentation is one of the most important things for your editor to provide, since it makes it blindingly obvious when your parens don't match up. Also, closing parens never go on their own line, fns don't need internal names unless you are planning on recursing, and I generally have a few more newlines than you do. I like to think that my code above is reasonably decently styled, and as another example, here is how I would format your code:

(defn cart
  ([] '())
  ([l1] (map list l1))
  ([l1 l2] 
    (map (fn [x]
           (map (fn [y]
                  (list x y))
                l2))
         l1)))

(defn cartesian-product [& lists] 
  (reduce cart lists))

Question 6

For most purposes Alan's answer is great as you get a lazy comprehension, and a lazy seq will not cause a stack overflow as you realize its members, even if you do not use (recur).

I was interested in trying to craft the tail recursive version with explicit recur, not the least of which because laziness wasn't going to be of any help in my application, but also for fun and giggles:

(defn cartesian-product
  ([cols] (cartesian-product '([]) cols))
  ([samples cols]
    (if (empty? cols)
      samples
      (recur (mapcat #(for [item (first cols)]
                        (conj % item)) samples)
             (rest cols)))))