 

Roundtripping xml in Clojure using clojure.xml/parse and clojure.xml/emit

Tags:

xml

clojure

The roundtrip below produces invalid XML because the result is not escaped correctly: the attribute values contain ' instead of &apos;. Am I doing something wrong, or is this a bug?

(ns xml-test
  (:require [clojure.xml :as xml]
            [clojure.zip :as zip])
  (:import (java.io ByteArrayInputStream)))

(def test-xml "<?xml version=\"1.0\" encoding=\"UTF-8\"?> <main> <item attr='&apos;test&apos;'> </item> </main>")

(def s (ByteArrayInputStream. (.getBytes test-xml "UTF-8")))

(xml/emit (zip/root (zip/xml-zip (xml/parse s))))

output:

<?xml version='1.0' encoding='UTF-8'?>
<main>
<item attr=''test''/>
</main>
nil
mac asked Mar 17 '10 14:03


1 Answer

I had a quick look at the source: clojure.xml/emit-element (which clojure.xml/emit calls) makes no effort whatsoever to encode any characters as XML entities; it writes attribute values straight through, unescaped. So this is a limitation of clojure.xml rather than a mistake in your code. I guess it means clojure.xml is quite limited in its usability; you should use clojure.contrib.lazy-xml instead. My apologies for not mentioning it in my answer to your first question on XML emitting; I didn't realise something like this would happen.
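To see why the missing escaping matters, here is a minimal sketch (the namespace and var names are made up for illustration) showing that clojure.xml/parse hands back attribute values with the entities already decoded, so it is the emitter's job to re-encode them:

```clojure
(ns xml-escape-demo
  (:require [clojure.xml :as xml])
  (:import (java.io ByteArrayInputStream)))

;; Parse a small document whose attribute value is written with &apos; entities.
(def parsed
  (xml/parse (ByteArrayInputStream.
              (.getBytes "<main><item attr='&apos;test&apos;'/></main>" "UTF-8"))))

;; The parser has decoded &apos; into a plain apostrophe, so this is the
;; string 'test' including both apostrophes. An emitter that wraps it in
;; single quotes without escaping ends up writing attr=''test''.
(def attr-value (get-in parsed [:content 0 :attrs :attr]))
```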

With clojure.contrib.lazy-xml, you can do the following:

user> (lazy-xml/emit
       (lazy-xml/parse-trim
        (java.io.StringReader. "<foo bar=\"&apos;&quot;&quot;&apos;\"/>")))
<?xml version="1.0" encoding="UTF-8"?><foo bar="'&quot;&quot;'"/>

If you really want to stick with clojure.xml, you'd have to skip clojure.xml/emit and use an XML producer of your choice instead. Actually, you can use clojure.xml/parse, transform the result as needed, then pass it to clojure.contrib.lazy-xml/emit; both libraries use the same Clojure representation of XML, but only the latter emits it properly.
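If you'd rather not pull in another library, a hand-rolled producer is also possible. The following is only a rough sketch, not how lazy-xml does it: escape-xml, emit-str and parse-str are hypothetical helpers written for this example, and the code ignores the XML declaration, namespaces, comments and CDATA. It walks the map structure that clojure.xml/parse returns and escapes the five predefined XML entities in attribute values and text:

```clojure
(ns xml-roundtrip-sketch
  (:require [clojure.string :as str]
            [clojure.xml :as xml])
  (:import (java.io ByteArrayInputStream)))

(defn escape-xml
  "Replaces the five characters that have predefined XML entities."
  [s]
  (str/escape (str s) {\& "&amp;" \< "&lt;" \> "&gt;" \" "&quot;" \' "&apos;"}))

(defn emit-str
  "Hypothetical helper: serialises a clojure.xml element map (or a text
  node, which is a plain string) to an escaped XML string."
  [node]
  (if (string? node)
    (escape-xml node)
    (let [tag   (name (:tag node))
          ;; Attribute values are double-quoted and escaped.
          attrs (apply str (for [[k v] (:attrs node)]
                             (str " " (name k) "=\"" (escape-xml v) "\"")))
          body  (apply str (map emit-str (:content node)))]
      (if (empty? body)
        (str "<" tag attrs "/>")
        (str "<" tag attrs ">" body "</" tag ">")))))

(defn parse-str
  "Hypothetical helper: parses an XML string with clojure.xml/parse."
  [s]
  (xml/parse (ByteArrayInputStream. (.getBytes s "UTF-8"))))

;; Round-trip the document from the question (without the declaration).
(def round-tripped
  (emit-str (parse-str "<main><item attr='&apos;test&apos;'/></main>")))
```

Here round-tripped keeps the apostrophes encoded as &apos;, so the result can be parsed again. For anything beyond simple documents, a real XML library remains the safer choice.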

Michał Marczyk answered Sep 30 '22 23:09