Question

I naively implemented the trie from Okasaki's book "Purely Functional Data Structures" in Clojure. Just for learning purposes, because probably there is no real gain, since Clojure used a more efficient trie under the hood of normal maps, right? There is one thing that still bothers me about it: trie-assoc doesn't use recur. How could I refactor it so it only does a recursive call on the tail using recur? I have the feeling I could use zippers for this, but I haven't looked into zippers that much yet. One other thing that maybe could be improved is the (if (empty? key) ...) followed by a destructuring on key. Could this be more concise?

(ns okasaki-trie.core)

(def empty-trie {:value nil :nodes nil})

(defn trie-get [key trie]
  (if (empty? key) (get trie :value)
      (let [[k & ks] key]
        (recur ks (get-in trie [:nodes k])))))

(defn trie-assoc [key val trie]
  (if (empty? key) (assoc trie :value val) 
      (let [[k & ks] key
            sub-trie-k (or (get-in trie [:nodes k]) empty-trie)
            sub-trie-k-new (trie-assoc ks val sub-trie-k)]
        (assoc-in trie [:nodes k] sub-trie-k-new))))

;;(def first-trie (trie-assoc "a" 3 empty-trie))
;;first-trie => {:nodes {\a {:nodes nil, :value 3}}, :value nil}

;;(def second-trie (trie-assoc "ab" 5 first-trie))
;;second-trie => {:nodes {\a {:nodes {\b {:nodes nil, :value 5}},
;;:value 3}}, :value nil}

;;(def third-trie (trie-assoc "b" 2 second-trie))
;;third-trie  => {:nodes {\b {:nodes nil, :value 2}, \a {:nodes {\b
;;{:nodes nil, :value 5}}, :value 3}}, :value nil}

;;(trie-get "b" third-trie)  ;=> 2
;;(trie-get "ab" third-trie) ;=> 5
;;(trie-get "abc" third-trie)  ;=> nil
Was it helpful?

Solution

I haven't looked deeply into this algorithm, but I doubt if tail recursion is (usefully) possible. You have to walk down a tree, and then back up it to reconstruct the nodes on the path to the root. This requires a stack, hence no recur.

If you want, you can manage your own "stack" on the heap, by keeping a list of parent-nodes and manually walking them back to the root. Then you can use recur and avoid stack overflows, but you won't be using any less memory (probably more, really). So this is only a useful trick if you have trees with 2^1000 elements in them (and I rather doubt that you do).

Edit

Just noticed your question about

(if (empty? key)
  (...empty...)
  (let [[k & ks] key]
    (...more...)))

You can rewrite this more simply and concisely as:

(if-let [[k & ks] (seq key)]
  (...more...)
  (...empty...))

OTHER TIPS

I know this is both necromancy, and a tad off-topic, but I came across this question exploring implementing a trie in Clojure. I was compelled by tying to implement with immutability. Here is the solution I came up with:

(def  trie (atom {}))
(defn key-list     [s    ] (into [] (map str s)))
(defn trie-add     [t k v] (assoc-in t (conj (key-list k) :v) v))
(defn trie-query   [t k  ] (get-in (deref t) (conj (key-list k) :v)))
(defn trie-subtree [t k  ] (get-in (deref t) (key-list k)))

which provides a surprisingly clean implementation, so I wanted to share.

For cut-and-paste exploration:

(swap! trie trie-add "aac" 1)
(swap! trie trie-add "abc" 4)
(swap! trie trie-add "ab" 5)
(swap! trie trie-add "oaailfnalieuhkakjdfkjh" 1)
(swap! trie trie-add "aaailfnalieuhkakjdfkjh" 1)
(swap! trie trie-add "oaailfnalieancbnwiuybw" 1)
(swap! trie trie-add "oa" 3)

(trie-query trie "ab")
(trie-query trie "abc")
(trie-query trie "oaailfnalie")
(trie-query trie "oa")
(trie-query trie "oaailfnalieuhkakjdfkjh")

(trie-subtree trie "oa")
(trie-subtree trie "a")
(trie-subtree trie "oaailfnalieancbnwiuyb")
(trie-subtree trie "oaailfnalieancbnwiuy")
(trie-subtree trie "oaailfnalieancbnwiu")

(deref trie)

good times. Maybe this answers the "Could this be more concise" question.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top