I am trying to get a handle on the new defprotocol, reify, etc.
I have an org.w3c.dom.NodeList returned from an XPath call and I would like to "convert" it to an ISeq.
In Scala, I implemented an implicit conversion method:
implicit def nodeList2Traversable(nodeList: NodeList): Traversable[Node] = {
  new Traversable[Node] {
    def foreach[A](process: (Node) => A) {
      for (index <- 0 until nodeList.getLength) {
        process(nodeList.item(index))
      }
    }
  }
}
NodeList includes the methods int getLength() and Node item(int index).
How do I do the equivalent in Clojure? I expect that I will need to use defprotocol. What functions do I need to define to create a seq?
If I do a simple, naive conversion to a list using loop and recur, I will end up with a non-lazy structure.
Most of Clojure's sequence-processing functions return lazy seqs, including the map and range functions:
(defn node-list-seq [^org.w3c.dom.NodeList node-list]
  (map (fn [index] (.item node-list index))
       (range (.getLength node-list))))
Note that the type hint for NodeList above isn't strictly necessary, but it lets the compiler avoid reflection, which improves performance.
Now you can use that function like so:
(map #(.getLocalName %) (node-list-seq your-node-list))
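To tie this back to the XPath call mentioned in the question, here is a minimal, self-contained sketch. The helper names parse-xml and xpath-local-names and the sample XML string are assumptions made up for illustration; node-list-seq is the function from above, and everything else is the standard javax.xml API. The parser is set to be namespace-aware so that getLocalName returns the element names.

```clojure
(import [javax.xml.parsers DocumentBuilderFactory]
        [javax.xml.xpath XPathFactory XPathConstants]
        [java.io ByteArrayInputStream]
        [org.w3c.dom Document Node NodeList])

;; Same function as in the answer above: a lazy seq over a NodeList.
(defn node-list-seq [^NodeList node-list]
  (map (fn [index] (.item node-list index))
       (range (.getLength node-list))))

;; Hypothetical helper: parses an XML string into a DOM Document.
(defn parse-xml [^String s]
  (let [factory (doto (DocumentBuilderFactory/newInstance)
                  (.setNamespaceAware true))]
    (.parse (.newDocumentBuilder factory)
            (ByteArrayInputStream. (.getBytes s "UTF-8")))))

;; Hypothetical helper: evaluates an XPath expression against a document
;; and returns the local names of the matching nodes as a vector.
(defn xpath-local-names [^Document doc ^String expr]
  (let [xpath (.newXPath (XPathFactory/newInstance))
        ;; NODESET asks the XPath engine to return an org.w3c.dom.NodeList
        nodes (.evaluate xpath expr doc XPathConstants/NODESET)]
    (mapv (fn [^Node n] (.getLocalName n))
          (node-list-seq nodes))))

(println (xpath-local-names (parse-xml "<root><a/><b/><c/></root>") "/root/*"))
;; prints [a b c]
```

mapv is used here only so that the result is fully realized before printing; plain map would keep it lazy.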
Use a for comprehension; these yield lazy sequences.
Here's the code for you. I've taken the time to make it runnable on the command line; you only need to replace the name of the parsed XML file.
Caveat 1: avoid def-ing your variables. Use local variables instead.
Caveat 2: this is the Java API for XML, so the objects are mutable; since you have a lazy sequence, if any changes happen to the mutable DOM tree while you're iterating, you might hit unpleasant race conditions.
Caveat 3: even though this is a lazy structure, the whole DOM tree is already in memory anyway. (I'm not entirely sure about this last point; I think the API tries to defer reading the tree into memory until needed, but there are no guarantees.) So if you run into trouble with big XML documents, try to avoid the DOM approach.
(require '[clojure.java.io :as io])
(import [javax.xml.parsers DocumentBuilderFactory])
(import [org.xml.sax InputSource])
(def dbf (DocumentBuilderFactory/newInstance))
(doto dbf
  (.setValidating false)
  (.setNamespaceAware true)
  (.setIgnoringElementContentWhitespace true))
(def builder (.newDocumentBuilder dbf))
(def doc (.parse builder (InputSource. (io/reader "C:/workspace/myproject/pom.xml"))))
(defn lazy-child-list [element]
  (let [nodelist (.getChildNodes element)
        len (.getLength nodelist)]
    (for [i (range len)]
      (.item nodelist i))))
;; To print the children of an element
(-> doc
    (.getDocumentElement)
    (lazy-child-list)
    (println))
;; Prints clojure.lang.LazySeq
(-> doc
    (.getDocumentElement)
    (lazy-child-list)
    (class)
    (println))
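Regarding Caveat 2, one possible way to sidestep the mutability problem is to copy the children into a vector eagerly before touching the tree. This is only a sketch: the helper names child-snapshot, parse-root and remove-all-children! and the sample XML are assumptions for illustration, not part of the code above. It relies on the fact that the NodeList returned by getChildNodes is live, so it shrinks as nodes are removed, while the vector does not.

```clojure
(import [javax.xml.parsers DocumentBuilderFactory]
        [java.io ByteArrayInputStream]
        [org.w3c.dom Document Element Node NodeList])

;; Hypothetical helper: eagerly copies an element's children into a vector,
;; so later changes to the DOM tree cannot affect the collection.
(defn child-snapshot [^Node element]
  (let [^NodeList nodelist (.getChildNodes element)]
    (mapv (fn [i] (.item nodelist (int i)))
          (range (.getLength nodelist)))))

;; Hypothetical helper: parses an XML string and returns its root element.
(defn parse-root [^String s]
  (let [builder (.newDocumentBuilder (DocumentBuilderFactory/newInstance))
        ^Document doc (.parse builder
                              (ByteArrayInputStream. (.getBytes s "UTF-8")))]
    (.getDocumentElement doc)))

;; Hypothetical helper: removes every child of root, walking the snapshot
;; rather than the live NodeList. Returns
;; [number of nodes in the snapshot, number of children left on root].
(defn remove-all-children! [^Element root]
  (let [children (child-snapshot root)]
    (doseq [^Node child children]
      (.removeChild root child))
    [(count children) (.getLength (.getChildNodes root))]))

(println (remove-all-children! (parse-root "<root><a/><b/><c/></root>")))
;; prints [3 0]
```

The trade-off is that the snapshot is no longer lazy, so this only makes sense when the child list is small enough to hold in a vector, which it is whenever the DOM tree itself already fits in memory.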