
XMLStarlet: Printing one line per item, while using datum from parent element

I have XML data formatted in this fashion:

<XML>
    <Waveforms Time="01/01/2009 3:00:02 AM">
        <WaveformData Channel="I">1, 2, 3, 4, 5, 6 </WaveformData>
        <WaveformData Channel="II">9, 8, 7, 6, 5, 4 </WaveformData>
    </Waveforms>
    <Waveforms Time="01/01/2009 3:00:04 AM">
        <WaveformData Channel="I">1, 2, 3, 4, 5, 6 </WaveformData>
        <WaveformData Channel="II">9, 8, 7, 6, 5, 4 </WaveformData>
    </Waveforms>
</XML>

I am trying to use xmlstarlet to convert this data to a comma-delimited text file. The desired output would look like this:

Time Attribute, Channel Attribute, Data
01/01/2009 3:00:02 AM, I, 1, 2, 3, 4, 5, 6
01/01/2009 3:00:02 AM, II, 9, 8, 7, 6, 5, 4
01/01/2009 3:00:04 AM, I, 1, 2, 3, 4, 5, 6
01/01/2009 3:00:04 AM, II, 9, 8, 7, 6, 5, 4

The best I can come up with is:

 xmlstarlet sel -T -t -m //XML/Waveforms -v @Time -o "," -m Waves -v WaveformData/@Channel -o "," -v WaveformData -o "," -b -n testwave2.xml > testwave.txt

Which gives a result like this:

 01/01/2009 3:00:02 AM, I, 1, 2, 3, 4, 5, 6, II, 9, 8, 7, 6, 5, 4
 01/01/2009 3:00:04 AM, I, 1, 2, 3, 4, 5, 6, II, 9, 8, 7, 6, 5, 4

It's clear how to print one line per Waveforms element, but not how to print one line per WaveformData element while including the Time attribute from its parent. Can this be done? Alternatively, should I work around it by slicing and pasting the output afterwards?

Tom Fogarty asked Jan 04 '23 12:01



2 Answers

Match on the WaveformData elements, since those are what you want one line per, and traverse upwards in the tree with .. to read the Time attribute from the parent.

$ xmlstarlet sel -T -t -m /XML/Waveforms/WaveformData \
     -v ../@Time -o "," \
     -v @Channel -o "," \
     -v . -n <in.xml
01/01/2009 3:00:02 AM,I,1, 2, 3, 4, 5, 6 
01/01/2009 3:00:02 AM,II,9, 8, 7, 6, 5, 4 
01/01/2009 3:00:04 AM,I,1, 2, 3, 4, 5, 6 
01/01/2009 3:00:04 AM,II,9, 8, 7, 6, 5, 4 
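
For readers who want to check the logic without xmlstarlet installed, the same walk can be sketched with Python's standard library. This is only an illustration, not part of the answer above: the function name rows_per_waveform_data is made up for the example, and because xml.etree.ElementTree keeps no parent pointers, the sketch loops over each Waveforms element and then its WaveformData children instead of stepping up with ../ as the XPath does.

```python
import xml.etree.ElementTree as ET

# Sample input copied from the question.
SAMPLE = """<XML>
    <Waveforms Time="01/01/2009 3:00:02 AM">
        <WaveformData Channel="I">1, 2, 3, 4, 5, 6 </WaveformData>
        <WaveformData Channel="II">9, 8, 7, 6, 5, 4 </WaveformData>
    </Waveforms>
    <Waveforms Time="01/01/2009 3:00:04 AM">
        <WaveformData Channel="I">1, 2, 3, 4, 5, 6 </WaveformData>
        <WaveformData Channel="II">9, 8, 7, 6, 5, 4 </WaveformData>
    </Waveforms>
</XML>"""


def rows_per_waveform_data(xml_text):
    """Return one 'Time,Channel,Data' string per WaveformData element.

    Hypothetical helper: the name and return shape are assumptions
    made for this sketch.
    """
    root = ET.fromstring(xml_text)  # root is the <XML> element
    rows = []
    for waveforms in root.findall("Waveforms"):
        time = waveforms.get("Time")  # parent attribute, read once
        for data in waveforms.findall("WaveformData"):
            # Element text is kept as-is, including any trailing space.
            rows.append(",".join([time, data.get("Channel"), data.text or ""]))
    return rows


if __name__ == "__main__":
    for row in rows_per_waveform_data(SAMPLE):
        print(row)
```

Holding the parent's Time in a variable while iterating over its children plays the same role as ../@Time in the command above.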

Alternatively, if you know that each Waveforms element will have exactly two WaveformData children, one with Channel="I" and one with Channel="II", you could do the following:

$ xmlstarlet sel -T -t -m /XML/Waveforms \
    -v ./@Time -o ",I,"  -v './WaveformData[@Channel="I"]' -n \
    -v ./@Time -o ",II," -v './WaveformData[@Channel="II"]' -n <in.xml
01/01/2009 3:00:02 AM,I,1, 2, 3, 4, 5, 6
01/01/2009 3:00:02 AM,II,9, 8, 7, 6, 5, 4
01/01/2009 3:00:04 AM,I,1, 2, 3, 4, 5, 6
01/01/2009 3:00:04 AM,II,9, 8, 7, 6, 5, 4
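
The fixed-channel variant can be sketched in Python as well, purely as an illustration. The names CHANNELS and rows_for_known_channels are hypothetical, and the sketch assumes the channel list is known in advance, which is the same assumption the command above makes. It relies on the attribute predicate form [@Channel='...'] that xml.etree.ElementTree supports in its limited XPath subset.

```python
import xml.etree.ElementTree as ET

# One Waveforms group from the question, enough to show the lookup.
SAMPLE = """<XML>
    <Waveforms Time="01/01/2009 3:00:02 AM">
        <WaveformData Channel="I">1, 2, 3, 4, 5, 6 </WaveformData>
        <WaveformData Channel="II">9, 8, 7, 6, 5, 4 </WaveformData>
    </Waveforms>
</XML>"""

# Assumption: the set of channels is fixed and known up front.
CHANNELS = ("I", "II")


def rows_for_known_channels(xml_text, channels=CHANNELS):
    """Emit one row per (Waveforms, channel) pair using a predicate lookup."""
    root = ET.fromstring(xml_text)
    rows = []
    for waveforms in root.findall("Waveforms"):
        time = waveforms.get("Time")
        for channel in channels:
            # Select the child by its Channel attribute; an absent
            # channel yields an empty data field instead of an error.
            data = waveforms.findtext(
                f"WaveformData[@Channel='{channel}']", default=""
            )
            rows.append(f"{time},{channel},{data}")
    return rows


if __name__ == "__main__":
    print("\n".join(rows_for_known_channels(SAMPLE)))
```

One trade-off of this shape is that a missing channel still produces a row with an empty data field, whereas matching on WaveformData directly only produces rows for elements that exist.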
Charles Duffy answered Jan 14 '23 13:01



As a slight variation on Charles Duffy's answer, you could use the XPath concat() function to build each line in a single expression, and an initial template to print the CSV header:

$ xmlstarlet sel \
    -t -o 'Time Attribute, Channel Attribute, Data' -n \
    -t -m '//Waveforms/WaveformData' \
       -v 'concat(../@Time, ", ", @Channel, ", ", text())' -n \
  waveforms.xml
Time Attribute, Channel Attribute, Data
01/01/2009 3:00:02 AM, I, 1, 2, 3, 4, 5, 6
01/01/2009 3:00:02 AM, II, 9, 8, 7, 6, 5, 4
01/01/2009 3:00:04 AM, I, 1, 2, 3, 4, 5, 6
01/01/2009 3:00:04 AM, II, 9, 8, 7, 6, 5, 4
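
As a rough cross-check of the header-plus-rows layout, here is a minimal Python sketch using only the standard library. The names HEADER and to_csv_text are invented for the example, and stripping whitespace around the data is an extra step this sketch takes, not something the xmlstarlet command above does.

```python
import xml.etree.ElementTree as ET

# Sample input copied from the question.
SAMPLE = """<XML>
    <Waveforms Time="01/01/2009 3:00:02 AM">
        <WaveformData Channel="I">1, 2, 3, 4, 5, 6 </WaveformData>
        <WaveformData Channel="II">9, 8, 7, 6, 5, 4 </WaveformData>
    </Waveforms>
    <Waveforms Time="01/01/2009 3:00:04 AM">
        <WaveformData Channel="I">1, 2, 3, 4, 5, 6 </WaveformData>
        <WaveformData Channel="II">9, 8, 7, 6, 5, 4 </WaveformData>
    </Waveforms>
</XML>"""

HEADER = "Time Attribute, Channel Attribute, Data"


def to_csv_text(xml_text):
    """Build the header line followed by one line per WaveformData element."""
    root = ET.fromstring(xml_text)
    lines = [HEADER]
    # iter() finds Waveforms at any depth, similar in spirit to //Waveforms.
    for waveforms in root.iter("Waveforms"):
        for data in waveforms.findall("WaveformData"):
            fields = [
                waveforms.get("Time"),
                data.get("Channel"),
                (data.text or "").strip(),  # assumption: trim stray spaces
            ]
            lines.append(", ".join(fields))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_csv_text(SAMPLE), end="")
```

Joining the three fields with ", " mirrors what concat() does with its literal separators.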
Edwin Fine answered Jan 14 '23 14:01
