
Searching xml in Clojure

Tags: xml, clojure

I have the following sample xml:

<data>
  <products>
    <product>
      <section>Red Section</section>
      <images>
        <image>img.jpg</image>
        <image>img2.jpg</image>
      </images>
    </product>
    <product>
      <section>Blue Section</section>
      <images>
        <image>img.jpg</image>
        <image>img3.jpg</image>
      </images>
    </product>
    <product>
      <section>Green Section</section>
      <images>
        <image>img.jpg</image>
        <image>img2.jpg</image>
      </images>
    </product>
  </products>
</data>

I know how to parse it in Clojure:

(require '[clojure.xml :as xml])
(def x (xml/parse "location/of/that/xml"))

This returns a nested map describing the XML:

{:tag :data,
 :attrs nil,
 :content [
     {:tag :products,
      :attrs nil,
      :content [
          {:tag :product,
           :attrs nil,
           :content [] ..

This structure can of course be traversed with standard Clojure functions, but that quickly becomes verbose, especially compared with querying it through something like XPath. Is there a helper for traversing and searching such a structure? How can I, for example:

  • get a list of all <product>
  • get only the products whose <images> tag contains an <image> with text "img2.jpg"
  • get the product whose section is "Red Section"
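For comparison, the three lookups listed above can be sketched with nothing but clojure.xml and xml-seq. This is a minimal sketch, not a recommended solution: the helper names (children, text-of, image-names) are hypothetical, and the sample document is inlined as a string so the snippet runs on its own.

```clojure
(require '[clojure.xml :as xml])
(import '(java.io ByteArrayInputStream))

;; Sample document inlined (trimmed copy of the XML above).
(def sample-xml
  "<data><products>
     <product><section>Red Section</section>
       <images><image>img.jpg</image><image>img2.jpg</image></images></product>
     <product><section>Blue Section</section>
       <images><image>img.jpg</image><image>img3.jpg</image></images></product>
     <product><section>Green Section</section>
       <images><image>img.jpg</image><image>img2.jpg</image></images></product>
   </products></data>")

(def root
  (xml/parse (ByteArrayInputStream. (.getBytes sample-xml "UTF-8"))))

;; Hypothetical helper: direct child elements of node with the given tag.
(defn children [node tag]
  (filter #(= tag (:tag %)) (:content node)))

;; Hypothetical helper: text of the first child with the given tag, or nil.
(defn text-of [node tag]
  (some->> (children node tag) first :content (apply str)))

;; Hypothetical helper: all image names listed under a product.
(defn image-names [product]
  (for [images (children product :images)
        image  (children images :image)]
    (apply str (:content image))))

;; 1. all <product> elements (xml-seq walks every node depth-first)
(def products
  (filter #(= :product (:tag %)) (xml-seq root)))

;; 2. products that list an <image> with text "img2.jpg"
(def with-img2
  (filter #(some #{"img2.jpg"} (image-names %)) products))

;; 3. the product whose <section> is "Red Section"
(def red
  (first (filter #(= "Red Section" (text-of % :section)) products)))
```

Each query needs its own filter and helper, which is the verbosity in question.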

Thanks

pistacchio asked Jul 18 '12 09:07



2 Answers

Using zippers from data.zip, here is a solution for your second use case:

(ns core
  (:require [clojure.zip :as zip]
            [clojure.xml :as xml]
            [clojure.data.zip.xml :refer [xml-> xml1-> text text=]]))

;; PATH stands for the location of the XML file
(def data (zip/xml-zip (xml/parse PATH)))
(def products (xml-> data :products :product))

(for [product products :let [image (xml-> product :images :image)]
                       :when (some (text= "img2.jpg") image)]
  {:section (xml1-> product :section text)
   :images (map text image)})
=> ({:section "Red Section", :images ("img.jpg" "img2.jpg")}
    {:section "Green Section", :images ("img.jpg" "img2.jpg")})
ponzao answered Oct 16 '22 15:10



Here's an alternative version using data.zip that covers all three use cases. I've found that xml-> and xml1-> have pretty powerful navigation built in, with sub-queries expressed as vectors.

;; [org.clojure/data.zip "0.1.1"]

(ns example.core
  (:require
   [clojure.zip :as zip]
   [clojure.xml :as xml]
   [clojure.data.zip.xml :refer [text xml-> xml1->]]))

(def data (zip/xml-zip (xml/parse "/tmp/products.xml")))

(let [all-products (xml-> data :products :product)
      red-section (xml1-> data :products :product [:section "Red Section"])
      img2 (xml-> data :products :product [:images [:image "img2.jpg"]])]
  {:all-products (map (fn [product] (xml1-> product :section text)) all-products)
   :red-section (xml1-> red-section :section text)
   :img2 (map (fn [product] (xml1-> product :section text)) img2)})

=> {:all-products ("Red Section" "Blue Section" "Green Section"),
    :red-section "Red Section",
    :img2 ("Red Section" "Green Section")}
Terje Sten Bjerkseth answered Oct 16 '22 17:10
