I have XML data formatted in this fashion:
<XML>
<Waveforms Time="01/01/2009 3:00:02 AM">
<WaveformData Channel="I">1, 2, 3, 4, 5, 6 </WaveformData>
<WaveformData Channel="II">9, 8, 7, 6, 5, 4 </WaveformData>
</Waveforms>
<Waveforms Time="01/01/2009 3:00:04 AM">
<WaveformData Channel="I">1, 2, 3, 4, 5, 6 </WaveformData>
<WaveformData Channel="II">9, 8, 7, 6, 5, 4 </WaveformData>
</Waveforms>
</XML>
I am trying to use xmlstarlet to parse this data to a text file (comma delimited). The desired output would look like this:
Time Attribute, Channel Attribute, Data
01/01/2009 3:00:02 AM, I, 1, 2, 3, 4, 5, 6
01/01/2009 3:00:02 AM, II, 9, 8, 7, 6, 5, 4
01/01/2009 3:00:04 AM, I, 1, 2, 3, 4, 5, 6
01/01/2009 3:00:04 AM, II, 9, 8, 7, 6, 5, 4
The best I can come up with is:
xmlstarlet sel -T -t -m //XML/Waveforms -v @Time -o "," -m Waves -v WaveformData/@Channel -o "," -v WaveformData -o "," -b -n testwave2.xml > testwave.txt
Which gives a result like this:
01/01/2009 3:00:02 AM, I, 1, 2, 3, 4, 5, 6, II, 9, 8, 7, 6, 5, 4
01/01/2009 3:00:04 AM, I, 1, 2, 3, 4, 5, 6, II, 9, 8, 7, 6, 5, 4
It's clear how to print one line per Waveforms, but not how to print one line per WaveformData if I want to include the time attribute from its parent. Can this be done? Alternately, should I work around and do some slicing and pasting to fix it on the back end afterwards?
Match on the WaveformData elements, since those are what you want one line each for, and step up to the parent with ../ to read its Time attribute.
$ xmlstarlet sel -T -t -m /XML/Waveforms/WaveformData \
-v ../@Time -o "," \
-v @Channel -o "," \
-v . -n <in.xml
01/01/2009 3:00:02 AM,I,1, 2, 3, 4, 5, 6
01/01/2009 3:00:02 AM,II,9, 8, 7, 6, 5, 4
01/01/2009 3:00:04 AM,I,1, 2, 3, 4, 5, 6
01/01/2009 3:00:04 AM,II,9, 8, 7, 6, 5, 4
Alternatively, if you know that each Waveforms element will have exactly two WaveformData children, one for channel I and one for channel II, you could do the following:
$ xmlstarlet sel -T -t -m /XML/Waveforms \
-v ./@Time -o ",I," -v './WaveformData[@Channel="I"]' -n \
-v ./@Time -o ",II," -v './WaveformData[@Channel="II"]' -n <in.xml
01/01/2009 3:00:02 AM,I,1, 2, 3, 4, 5, 6
01/01/2009 3:00:02 AM,II,9, 8, 7, 6, 5, 4
01/01/2009 3:00:04 AM,I,1, 2, 3, 4, 5, 6
01/01/2009 3:00:04 AM,II,9, 8, 7, 6, 5, 4
To give a slight variation of Charles Duffy's answer, you could use the concat()
function to simplify it a bit, and an initial template to provide the CSV header:
$ xmlstarlet sel \
-t -o 'Time Attribute, Channel Attribute, Data' -n \
-t -m '//Waveforms/WaveformData' \
-v 'concat(../@Time, ", ", @Channel, ", ", text())' -n \
waveforms.xml
Time Attribute, Channel Attribute, Data
01/01/2009 3:00:02 AM, I, 1, 2, 3, 4, 5, 6
01/01/2009 3:00:02 AM, II, 9, 8, 7, 6, 5, 4
01/01/2009 3:00:04 AM, I, 1, 2, 3, 4, 5, 6
01/01/2009 3:00:04 AM, II, 9, 8, 7, 6, 5, 4
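If xmlstarlet is not available, or you would rather do the reshaping in a script, the same idea can be sketched with Python's standard library. This is only an illustration and not part of either answer above: the function name waveform_rows and the SAMPLE string are made up for the example. Since ElementTree has no parent axis like ../, the sketch loops over each Waveforms element and then over its WaveformData children, carrying the Time attribute down instead of climbing up to it.

```python
import xml.etree.ElementTree as ET

# Sample input copied from the question (hypothetical constant name).
SAMPLE = """<XML>
<Waveforms Time="01/01/2009 3:00:02 AM">
<WaveformData Channel="I">1, 2, 3, 4, 5, 6 </WaveformData>
<WaveformData Channel="II">9, 8, 7, 6, 5, 4 </WaveformData>
</Waveforms>
<Waveforms Time="01/01/2009 3:00:04 AM">
<WaveformData Channel="I">1, 2, 3, 4, 5, 6 </WaveformData>
<WaveformData Channel="II">9, 8, 7, 6, 5, 4 </WaveformData>
</Waveforms>
</XML>"""


def waveform_rows(xml_text):
    """Return a header line plus one line per WaveformData element."""
    root = ET.fromstring(xml_text)
    rows = ["Time Attribute, Channel Attribute, Data"]
    # Outer loop: one Waveforms element per timestamp.
    for waveforms in root.findall("Waveforms"):
        time = waveforms.get("Time")
        # Inner loop: one output line per channel, reusing the parent's Time.
        for data in waveforms.findall("WaveformData"):
            samples = (data.text or "").strip()  # drop the trailing space
            rows.append(", ".join([time, data.get("Channel"), samples]))
    return rows


if __name__ == "__main__":
    print("\n".join(waveform_rows(SAMPLE)))
```

To work on a file instead of the inline string, read it first and pass its contents to waveform_rows, then write the joined lines wherever you need them.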