
Is there a parser that converts HTML to hiccup structures?

Tags:

clojure

hiccup

I'm looking for a function that does the reverse of Clojure's hiccup and turns HTML back into a hiccup data structure,

so

   <html></html>

turns into

[:html]

etc.


Following up on the answer by @kotarak, this now works for me:

(use 'net.cgrand.enlive-html)
(import 'java.io.StringReader)

(defn enlive->hiccup
   [el]
   (if-not (string? el)
     (->> (map enlive->hiccup (:content el))
       (concat [(:tag el) (:attrs el)])
       (keep identity)
       vec)
     el))

(defn html->enlive 
  [html]
  (first (html-resource (StringReader. html))))

(defn html->hiccup [html]
  (-> html
      html->enlive
      enlive->hiccup))

=> (html->hiccup "<html><body id='foo'>hello</body></html>")
[:html [:body {:id "foo"} "hello"]]
zcaudate asked Jun 19 '12 05:06


1 Answer

You could use html-resource from enlive to get a structure like this:

{:tag :html :attrs {} :content []}

Then traverse this and turn it into a hiccup structure.

(defn html->hiccup
   [html]
   (if-not (string? html)
     (->> (map html->hiccup (:content html))
       (concat [(:tag html) (:attrs html)])
       (keep identity)
       vec)
     html))

Here is a usage example:

user=>  (html->hiccup {:tag     :p
                       :content ["Hello" {:tag     :a
                                          :attrs   {:href "/foo"}
                                          :content ["World"]}
                                 "!"]})
[:p "Hello" [:a {:href "/foo"} "World"] "!"]
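
The traversal above keeps an attribute map whenever :attrs is non-nil, so a node such as {:tag :html :attrs {} :content []} would come out as [:html {}]. If you would rather drop empty attribute maps, a variant along these lines might work. This is only a sketch: node->hiccup is a hypothetical name (not part of enlive or hiccup), and it assumes every map in the tree is an element node with a :tag.

```clojure
;; Hypothetical variant of the traversal above. `node->hiccup` is an
;; illustrative name; it is not provided by enlive or hiccup.
;; Assumption: every map in the tree is an element node with a :tag.
(defn node->hiccup
  [node]
  (if (map? node)
    (let [{:keys [tag attrs content]} node
          children (map node->hiccup content)]
      ;; Include the attribute map only when it has at least one entry.
      (into (if (seq attrs) [tag attrs] [tag])
            children))
    ;; Strings (and any other non-map values) pass through unchanged.
    node))

(node->hiccup {:tag :html :attrs {} :content []})
;; => [:html]

(node->hiccup {:tag     :p
               :content ["Hello" {:tag     :a
                                  :attrs   {:href "/foo"}
                                  :content ["World"]}
                         "!"]})
;; => [:p "Hello" [:a {:href "/foo"} "World"] "!"]
```

It needs nothing beyond clojure.core, so it can be tried at a REPL on hand-written maps before wiring it up to html-resource.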
kotarak answered Oct 06 '22 01:10