Question

I'm new to Clojure and have been working with Enlive to transform the text nodes of HTML documents. My end goal is to convert the structure back into HTML, tags and all.

I'm currently able to take the structmap returned by enlive-html/html-resource and transform it back to html using

(apply str (html/emit* nodes))

where nodes is the structmap.

I'm also able to transform the structmap's :content text nodes as I wish. However, after transforming the content text nodes of the structmap, I end up with a lazyseq of MapEntries. I want to transform this back into a structmap so I can use emit* on it. This is a little tricky because the lazyseqs & structmaps are nested.
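To make the symptom concrete, here is a minimal sketch in plain Clojure (no Enlive required) of how iterating over a node map produces a sequence of entries instead of a map. The `node` value is a made-up example, not taken from a real parse.

```clojure
;; A hypothetical node shaped like the maps Enlive produces.
(def node {:tag :p, :attrs nil, :content '("some text")})

;; Iterating over a map visits its entries one by one, so the result
;; is a lazy sequence of key/value pairs rather than a map.
(def as-entries
  (for [entry node]
    entry))

(map? as-entries) ; => false
(seq? as-entries) ; => true

;; into can rebuild a map from such a sequence of pairs.
(def rebuilt (into {} as-entries))
```

Because the same thing happens at every nesting level, any conversion back has to recurse into :content as well.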

tldr:

How do I transform:

([:tag :html]
 [:attrs nil]
 [:content
  ("\n"
   ([:tag :head]
    [:attrs nil]
    [:content
     ("\n  "
      ([:tag :title] [:attrs nil] [:content ("Page Title")])
      "  \n")])
   "\n"
   ([:tag :body]
    [:attrs nil]
    [:content
     ("\n  "
      ([:tag :div]
       [:attrs {:id "wrap"}]
       [:content
        ("\n    "
         ([:tag :h1] [:attrs nil] [:content ("header")])
         "\n    "
         ([:tag :p] [:attrs nil] [:content ("some paragrah text")])
         "\n  ")])
      "\n")])
   "\n\n")])

into:

{:tag :html,
 :attrs nil,
 :content
 ("\n"
  {:tag :head,
   :attrs nil,
   :content
   ("\n  " {:tag :title, :attrs nil, :content ("Page Title")} "  \n")}
  "\n"
  {:tag :body,
   :attrs nil,
   :content
   ("\n  "
    {:tag :div,
     :attrs {:id "wrap"},
     :content
     ("\n    "
      {:tag :h1, :attrs nil, :content ("header")}
      "\n    "
      {:tag :p, :attrs nil, :content ("some paragrah text")}
      "\n  ")}
    "\n")}
  "\n\n")}

Update

kotarak's response pointed me in the direction of update-in, which I was able to use to modify the map in place without transforming it to a sequence, thus rendering my question irrelevant.

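;; Forward declaration at the top level, since the two functions call each other.
(declare update-content)

(defn modify-or-go-deeper
  "If item is a map, updates its content; if it's a string, modifies it;
  anything else is returned unchanged."
  [item]
  (cond
    (map? item) (update-content item)
    ;; modify-text is my own string-transforming function (not shown)
    (string? item) (modify-text item)
    :else item))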

(defn update-content
  "Calls modify-or-go-deeper on each element of the :content sequence"
  [coll]
  (update-in coll [:content] (partial map modify-or-go-deeper)))

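I was iterating over the map with for before, which is what turned it into a sequence of MapEntries. update-in changes only the value under :content and returns a map, so it is the way to go.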
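A self-contained sketch of this approach follows. The original modify-text is not shown in the post, so the version here (upper-casing the text) is only a hypothetical stand-in, and `page` is a made-up input.

```clojure
(require '[clojure.string :as str])

(declare update-content)

;; Hypothetical stand-in for the asker's modify-text.
(defn modify-text [s]
  (str/upper-case s))

(defn modify-or-go-deeper [item]
  (cond
    (map? item)    (update-content item)
    (string? item) (modify-text item)
    :else          item))

(defn update-content [node]
  ;; Only the :content value is rewritten; :tag and :attrs are untouched.
  (update-in node [:content] (partial map modify-or-go-deeper)))

(def page
  {:tag :div
   :attrs {:id "wrap"}
   :content ["intro " {:tag :h1, :attrs nil, :content ["header"]}]})

(def result (update-content page))
;; result is still a map, with every text node upper-cased
```

Since the result keeps the map shape at every level, it can be handed to the same emit* call as the untouched parse.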


Solution

Just put everything back into a map and walk the content recursively.

(defn into-xml
  [coll]
  (let [tag (into {} coll)]
    (update-in tag [:content] (partial map into-xml))))

Note that map is lazy, so the content is only transformed as you access it.
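The laziness can be observed with a small sketch that counts how often the transforming function runs. The counter and `counting-upper` are illustrative helpers, not part of the answer.

```clojure
(require '[clojure.string :as str])

;; Counts how many times the transformation has actually been applied.
(def calls (atom 0))

(defn counting-upper [s]
  (swap! calls inc)
  (str/upper-case s))

(def node
  (update-in {:tag :p, :attrs nil, :content '("a" "b")}
             [:content]
             (partial map counting-upper)))

(def calls-before @calls)      ; nothing has been realized yet

(def realized (doall (:content node)))

(def calls-after @calls)       ; both strings have now been transformed
```

In practice this means an error in the transformation surfaces only when the content is consumed, for example while emitting the HTML. Wrapping the map call in doall forces it eagerly if that matters.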

Edit: Whoops, I missed the string parts. Here is a working version:

(defn into-xml
  [coll]
  (if-not (string? coll)
    (let [tag (into {} coll)]
      (update-in tag [:content] (partial map into-xml)))
    coll))
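As a quick check, the function can be tried on a small hand-written input. The `entries` value below is a made-up example in the same shape as the data in the question.

```clojure
(defn into-xml
  [coll]
  (if-not (string? coll)
    (let [tag (into {} coll)]
      (update-in tag [:content] (partial map into-xml)))
    coll))

;; Hypothetical input: a sequence of key/value pairs with nested content.
(def entries
  '([:tag :p]
    [:attrs {:id "intro"}]
    [:content ("Hello, "
               ([:tag :b] [:attrs nil] [:content ("world")])
               "!")]))

(def node (into-xml entries))
;; node is a map again, and so is the nested :b element
```

One side effect worth knowing: a node whose :content is nil comes back with an empty sequence instead, because mapping over nil yields an empty sequence.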

OTHER TIPS

Try

(def mp '([:tag :html] [:attrs nil] [:content
    (""
    ([:tag :head] [:attrs nil] [:content
        ("\n\t\t"
        ([:tag :title] [:attrs nil] [:content ("page title")])
        "\n\t\t")])
        "\n\t"
        ([:tag :body] [:attrs nil] [:content
            ("\n\t\t"
            ([:tag :div] [:attrs {:id "wrapper"}] [:content
            ("\n\t\t  "
            ([:tag :h1] [:attrs nil] [:content
                ("\n  \t\t\tpage title"
                ([:tag :br] [:attrs nil] [:content ()])
                "\n  \t\t\tand more title\n  \t\t")])
                "\n  \t\t"
                ([:tag :p] [:attrs nil] [:content
                    ("\n  \t\tSome paragraph text"
                    ([:tag :img] [:attrs {:src "images/image.png", :id "image"}] [:content nil])
                    "\n  \t\t")])
            "\n\t\t")]
            "\n\t     \n\t\t"))]
        "\n\n"))]))

(clojure.walk/postwalk (fn [x]
                         (if (and (list? x) (vector? (first x)))
                           (into {} x)
                           x))
                       mp)

With the input above this throws an error, but if you change your input to

([:tag :html]
 [:attrs nil]
 [:content
  (""
   ([:tag :head]
    [:attrs nil]
    [:content
     ("\n\t\t"
      ([:tag :title] [:attrs nil] [:content ("page title")])
      "\n\t\t")])
   "\n\t"
   ([:tag :body]
    [:attrs nil]
    [:content
     ("\n\t\t"
      ([:tag :div]
       [:attrs {:id "wrapper"}]
       [:content
        ("\n\t\t  "
         ([:tag :h1]
          [:attrs nil]
          [:content
           ("\n  \t\t\tpage title"
            ([:tag :br] [:attrs nil] [:content ()])
            "\n  \t\t\tand more title\n  \t\t")])
         "\n  \t\t"
         ([:tag :p]
          [:attrs nil]
          [:content
           ("\n  \t\tSome paragraph text"
            ([:tag :img]
             [:attrs {:src "images/image.png", :id "image"}]
             [:content nil])
            "\n  \t\t")])
         "\n\t\t")]
       ))]))]))

then it works. The difference is that the edited input no longer has "\n\t\t"-like strings in the same list that holds the key-value pairs, so into {} only ever sees pairs. Hope this helps.
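The failure can be reproduced in isolation. The sketch below uses a hypothetical helper name, entries->map, and catches the exception broadly rather than relying on a specific exception class.

```clojure
;; Hypothetical helper: the same (into {} x) call the walker performs.
(defn entries->map [x]
  (into {} x))

;; Every element is a key/value pair, so this succeeds.
(def ok
  (entries->map '([:tag :br] [:attrs nil] [:content ()])))

;; A whitespace string sits next to the pairs, so into cannot treat it
;; as a map entry and throws.
(def stray-error
  (try
    (entries->map '([:tag :div] [:attrs nil] [:content ()] "\n\t\t"))
    nil
    (catch Exception e
      e)))
```

Whitespace strings are fine inside a :content list, because the walker leaves any list whose first element is not a vector untouched.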

Edit: The following worked for me:

(def mp '([:tag :html]
 [:attrs nil]
 [:content
  (""
   ([:tag :head]
    [:attrs nil]
    [:content
     ("\n\t\t"
      ([:tag :title] [:attrs nil] [:content ("page title")])
      "\n\t\t")])
   "\n\t"
   ([:tag :body]
    [:attrs nil]
    [:content
     ("\n\t\t"
      ([:tag :div]
       [:attrs {:id "wrapper"}]
       [:content
        ("\n\t\t  "
         ([:tag :h1]
          [:attrs nil]
          [:content
           ("\n  \t\t\tpage title"
            ([:tag :br] [:attrs nil] [:content ()])
            "\n  \t\t\tand more title\n  \t\t")])
         "\n  \t\t"
         ([:tag :p]
          [:attrs nil]
          [:content
           ("\n  \t\tSome paragraph text"
            ([:tag :img]
             [:attrs {:src "images/image.png", :id "image"}]
             [:content nil])
            "\n  \t\t")])
         "\n\t\t")]
       ))]))]))

(clojure.walk/postwalk (fn [x]
                         (if (and (list? x) (vector? (first x)))
                           (into {} x)
                           x))
                       mp)

Try copying and pasting it into a REPL. You should get the following:

{:tag :html,
 :attrs nil,
 :content
 (""
  {:tag :head,
   :attrs nil,
   :content
   ("\n\t\t"
    {:tag :title, :attrs nil, :content ("page title")}
    "\n\t\t")}
  "\n\t"
  {:tag :body,
   :attrs nil,
   :content
   ("\n\t\t"
    {:tag :div,
     :attrs {:id "wrapper"},
     :content
     ("\n\t\t  "
      {:tag :h1,
       :attrs nil,
       :content
       ("\n  \t\t\tpage title"
        {:tag :br, :attrs nil, :content ()}
        "\n  \t\t\tand more title\n  \t\t")}
      "\n  \t\t"
      {:tag :p,
       :attrs nil,
       :content
       ("\n  \t\tSome paragraph text"
        {:tag :img,
         :attrs {:src "images/image.png", :id "image"},
         :content nil}
        "\n  \t\t")}
      "\n\t\t")})})}
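One caveat: the list? test matches the quoted literal used here, but the data in the question is described as a lazy sequence, and list? is false for lazy sequences. A possible variant, offered as an assumption rather than as part of the original answer, swaps in seq? so both cases are covered. The name pairs->maps and the sample input are made up.

```clojure
(require '[clojure.walk :as walk])

(defn pairs->maps
  "Turns every sequence whose first element is a vector into a map."
  [form]
  (walk/postwalk
    (fn [x]
      ;; seq? is true for lists and for lazy sequences alike
      (if (and (seq? x) (vector? (first x)))
        (into {} x)
        x))
    form))

;; Hypothetical input built from lazy sequences instead of a quoted list.
(def lazy-entries
  (map identity
       [[:tag :p]
        [:attrs nil]
        [:content (map identity ["text"])]]))

(def is-list (list? lazy-entries)) ; a lazy seq is not a list

(def node (pairs->maps lazy-entries))
```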
Licensed under: CC-BY-SA with attribution