I am new to Clojure, so please bear with me. I have an XML file which looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<XVar Id="cdx9" Type="Dictionary">
<XVar Id="Base.AccruedPremium" Type="Multi" Value="" Rows="1" Columns="1">
<Row Id="0">
<Col Id="0" Type="Num" Value="0"/>
</Row>
</XVar>
<XVar Id="TrancheAnalysis.IndexDuration" Type="Multi" Value="" Rows="1" Columns="1">
<Row Id="0">
<Col Id="0" Type="Num" Value="3.4380728252313069"/>
</Row>
</XVar>
<XVar Id="TrancheAnalysis.IndexLevel01" Type="Multi" Value="" Rows="1" Columns="1">
<Row Id="0">
<Col Id="0" Type="Num" Value="30693.926279941188"/>
</Row>
</XVar>
<XVar Id="TrancheAnalysis.TrancheDelta" Type="Multi" Value="" Rows="1" Columns="1">
<Row Id="0">
<Col Id="0" Type="Num" Value="8.9304387917502073"/>
</Row>
</XVar>
<XVar Id="TrancheAnalysis.TrancheDuration" Type="Multi" Value="" Rows="1" Columns="1">
<Row Id="0">
<Col Id="0" Type="Num" Value="3.0775955481964035"/>
</Row>
</XVar>
</XVar>
And it repeats. From this I want to produce a CSV file with these columns:
IndexName,TrancheAnalysis.IndexDuration,TrancheAnalysis.TrancheDuration
cdx9,3.4380728252313069,3.0775955481964035
.........................................
.........................................
I am able to parse a simple XML file like this:
<?xml version="1.0" encoding="UTF-8"?>
<CalibrationData>
<IndexList>
<Index>
<Calibrate>Y</Calibrate>
<UseClientIndexQuotes>Y</UseClientIndexQuotes>
<IndexName>HYCDX10</IndexName>
<Tenor>06/20/2013</Tenor>
<TenorName>3Y</TenorName>
<IndexLevels>219.6</IndexLevels>
<Tranche>Equity0To0.15</Tranche>
<TrancheStart>0</TrancheStart>
<TrancheEnd>0.15</TrancheEnd>
<UseBreakEvenSpread>1</UseBreakEvenSpread>
<UseTlet>0</UseTlet>
<IsTlet>0</IsTlet>
<PctExpectedLoss>0</PctExpectedLoss>
<UpfrontFee>52.125</UpfrontFee>
<RunningFee>0</RunningFee>
<DeltaFee>5.3</DeltaFee>
<CentralCorrelation>0.1</CentralCorrelation>
<Currency>USD</Currency>
<RescalingMethod>PTIndexRescaling</RescalingMethod>
<EffectiveDate>06/17/2011</EffectiveDate>
</Index>
</IndexList>
</CalibrationData>
with this code:
(ns DynamicProgramming
  (:require [clojure.xml :as xml]))
; Get the input files
(def calibrationFile "C:/ashwani/Eclipse/HistoricalTrancheAnalysis/src/CalibrationQuotes.xml")
(def mktdataFile "C:/ashwani/Eclipse/HistoricalTrancheAnalysis/src/MarketData.xml")
(def sample "C:/ashwani/Eclipse/HistoricalTrancheAnalysis/src/Sample.xml")
; Parse the calibration input file
(def CalibOp
  (for [x (xml-seq
            (xml/parse (java.io.File. calibrationFile)))
        :when (or
                (= :IndexName (:tag x))
                (= :Tenor (:tag x))
                (= :UpfrontFee (:tag x))
                (= :RunningFee (:tag x))
                (= :DeltaFee (:tag x))
                (= :IndexLevels (:tag x))
                (= :TrancheStart (:tag x))
                (= :TrancheEnd (:tag x)))]
    (first (:content x))))
(println CalibOp)
But the second XML is simple. I don't know how to iterate through the nested structure of the first XML example and extract the information I want.
Any help would be great.
I would use data.zip (formerly clojure.contrib.zip-filter). It provides a lot of XML-parsing power and easily handles XPath-like expressions. The README describes it as a "system for filtering trees, and XML trees in particular".
Below I have some sample code for creating a "row" for the CSV file. The row is a map of the column name to the attribute value.
(ns work
  (:require [clojure.xml :as xml]
            [clojure.zip :as zip]
            ; data.zip namespace; older code used clojure.contrib.zip-filter.xml
            [clojure.data.zip.xml :as zf]))
; create a zip from the xml file
(def zip (zip/xml-zip (xml/parse "data.xml")))
; pulls out a list of all of the root "Id" attribute values
(zf/xml-> zip (zf/attr :Id))
(defn value
  "Finds the id and value for a particular element"
  [xvar-zip] ; the docstring must come before the argument vector
  (let [id    (-> xvar-zip zip/node :attrs :Id) ; manual access
        value (zf/xml1-> xvar-zip               ; use an XPath-like expression to pull the value out
                         :Row                   ; need the row element
                         :Col                   ; then the column element
                         (zf/attr :Value))]     ; and finally pull the Value out
    {id value}))
; gets the "column-value" pair for a single column
(zf/xml1-> zip
           (zf/attr= :Id "cdx9")                          ; filter on id "cdx9"
           :XVar                                          ; filter on XVars under it
           (zf/attr= :Id "TrancheAnalysis.IndexDuration") ; filter on id
           value)                                         ; apply the value function on the result of above
; creates a map of every column key to its corresponding value
(apply merge (zf/xml-> zip (zf/attr= :Id "cdx9") :XVar value))
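Once you have that map, turning it into a CSV line is a small step. Here is a minimal sketch, assuming the map has the shape produced by the (apply merge ...) call above and that no value contains a comma or quote (there is no CSV escaping here). The names columns, csv-header, csv-line and sample-row are made up for illustration.

```clojure
(require '[clojure.string :as str])

;; Columns taken from the desired CSV header in the question.
(def columns
  ["TrancheAnalysis.IndexDuration" "TrancheAnalysis.TrancheDuration"])

(defn csv-header
  "Header line: IndexName followed by the selected column ids."
  []
  (str/join "," (cons "IndexName" columns)))

(defn csv-line
  "One CSV line for a dictionary id and its id->value map.
  Columns missing from the map become empty fields."
  [index-name row]
  (str/join "," (cons index-name (map #(get row % "") columns))))

;; Hypothetical sample row, with values copied from the XML in the question.
(def sample-row
  {"Base.AccruedPremium"             "0"
   "TrancheAnalysis.IndexDuration"   "3.4380728252313069"
   "TrancheAnalysis.TrancheDuration" "3.0775955481964035"})

(println (csv-header))
(println (csv-line "cdx9" sample-row))
```

Keeping the column list in one place means the header and every data line stay in the same order.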
I'm not sure how the XML would work with multiple Dictionary XVars, since that element is the root. If you need to handle several, another function that is useful for this type of work is mapcat, which concatenates all of the values returned from the mapping function.
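To illustrate mapcat, here is a rough sketch that assumes the repeated Dictionary XVars are wrapped in a single root element. The Results wrapper, the cdx10 id, the shortened values and all helper names are assumptions for the example, not something from your file. It uses plain clojure.xml instead of data.zip so that it runs without extra dependencies.

```clojure
(require '[clojure.xml :as xml])

;; Hypothetical input: two Dictionary XVars under an assumed <Results> root.
(def sample-xml
  (str "<Results>"
       "<XVar Id=\"cdx9\" Type=\"Dictionary\">"
       "<XVar Id=\"TrancheAnalysis.IndexDuration\" Type=\"Multi\">"
       "<Row Id=\"0\"><Col Id=\"0\" Type=\"Num\" Value=\"3.4\"/></Row></XVar>"
       "<XVar Id=\"TrancheAnalysis.TrancheDuration\" Type=\"Multi\">"
       "<Row Id=\"0\"><Col Id=\"0\" Type=\"Num\" Value=\"3.0\"/></Row></XVar>"
       "</XVar>"
       "<XVar Id=\"cdx10\" Type=\"Dictionary\">"
       "<XVar Id=\"TrancheAnalysis.IndexDuration\" Type=\"Multi\">"
       "<Row Id=\"0\"><Col Id=\"0\" Type=\"Num\" Value=\"4.1\"/></Row></XVar>"
       "</XVar>"
       "</Results>"))

(defn parse-xml-string
  "Parses an XML string into clojure.xml's nested map structure."
  [s]
  (xml/parse (java.io.ByteArrayInputStream. (.getBytes s "UTF-8"))))

(defn child-elements
  "Direct children of element that have the given tag."
  [element tag]
  (filter #(= tag (:tag %)) (:content element)))

(defn first-value
  "Value attribute of the first Col in the first Row of an XVar, or nil."
  [xvar]
  (some-> xvar
          (child-elements :Row) first
          (child-elements :Col) first
          :attrs :Value))

(defn dictionary-entries
  "Seq of [dictionary-id xvar-id value] triples for one Dictionary XVar."
  [dict]
  (for [xvar (child-elements dict :XVar)]
    [(-> dict :attrs :Id) (-> xvar :attrs :Id) (first-value xvar)]))

(defn all-entries
  "mapcat runs dictionary-entries on every dictionary and concatenates
  the resulting seqs into one flat seq of triples."
  [root]
  (mapcat dictionary-entries (child-elements root :XVar)))

(def root (parse-xml-string sample-xml))

(doseq [entry (all-entries root)]
  (println entry))
```

If you want one row per dictionary rather than one flat list, use map in place of mapcat and build a map per dictionary.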
There are some more examples in the test source as well.
One other big recommendation I have is to make sure you use a lot of small functions. You'll find things much easier to debug, test, and work with.