Question

I'm new to Clojure. I was wondering how I could optimize an algorithm to count the number of inversions in a list. From what I understand, Clojure doesn't do tail call optimization unless specifically asked to? How do you get it to do this?

This first attempt with a mutated variable has a runtime of about 3.5s. But my second attempt was a functional version and it takes about 1m15s! and both require growing the stack size quite a bit (like -Xss12m).

How would I go about getting better performance?

I'd prefer to not have mutable variables (like the functional one) if possible. You can create the array file by typing something like seq 100000 | sort -R > IntArray.txt.

The first attempt w/ mutable variable:

(use 'clojure.java.io)

(def inversions 0)

(defn merge_and_count' [left right left_len]
  (if (empty? right) left
      (if (empty? left) right
          (if (<= (first left) (first right)) 
              (cons (first left)  (merge_and_count' (rest left) right (- left_len 1)))
              (let [_ (def inversions (+ inversions left_len))]
               (cons (first right) (merge_and_count' left (rest right) left_len)))
                  ))))

(defn inversion_count [list]
  (if (or (empty? list) (nil? (next list))) list
      (let [mid (quot (count list) 2)]
           (merge_and_count' (inversion_count (take mid list)) 
                             (inversion_count (drop mid list)) mid)
                                 )))

(defn parse-int [s]
   (Integer. (re-find  #"\d+" s )))

(defn get-lines [fname]
  (with-open [r (reader fname)]
    (doall (map parse-int (line-seq r)))))

(let [list (get-lines "IntArray.txt")
      _ (inversion_count list)]
  (print inversions))

My second attempt to be purely functional (no mutability):

(use 'clojure.java.io)


(defn merge_and_count' [left right inversions]
  (if (empty? right) (list left inversions)
      (if (empty? left) (list right inversions)
          (if (<= (first left) (first right)) 
              (let [result (merge_and_count' (rest left) right inversions)]
                   (list (cons (first left) (first result)) (second result)))
              (let [result (merge_and_count' left (rest right) (+ inversions (count left)))]
                   (list (cons (first right) (first result)) (second result)))
                       ))))

(defn inversion_count [list' list_len]
  (if (or (empty? list') (nil? (next list'))) (list list' 0)
      (let [mid (quot list_len 2)
            left (inversion_count (take mid list') mid)
            right (inversion_count (drop mid list') (- list_len mid))]
           (merge_and_count' (first left) (first right) (+ (second left) (second right)))
                                 )))

(defn parse-int [s]
   (Integer. (re-find  #"\d+" s )))

(defn get-lines [fname]
  (with-open [r (reader fname)]
    (doall (map parse-int (line-seq r)))))

(let [list (get-lines "IntArray.txt")
      result (inversion_count list 100000)]
  (print (second result)))
Was it helpful?

Solution

The stack overflows due to the recursion in merge-and-count. I tried this approach, and for 100000 items, it came back instantly.

(defn merge_and_count [left right inversions]
  (loop [l left r right inv inversions result []]
    (cond (and (empty? r) (empty? l)) [result inv]
          (empty? r) [(apply conj result l) inv]
          (empty? l) [(apply conj result r) inv]
          (<= (first l) (first r)) (recur (rest l) r inv (conj result (first l)))
          :else (recur l (rest r) (+ inv (count l))  (conj result (first r))))))

You need to replace this code with code from your second approach.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top