Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Powershell- Convert Complex XML to CSV

I want to convert the below XML to CSV. Though it has header line information in the XML file.

<?xml version="1.0" encoding="UTF-8"?>
<myfile>
  <updatetime>2019-07-30 08:30:30</updatetime>
  <process code="PRS1234" name="PROCESS1234" />
  <equipment code="EQP1234" name="EQUIPMENT1234" />
  <product type="equipment" planned="300" time="36000" cycletime="20" />
  <shift code="1" timestart="2019-07-30 02:00:00">
    <order index="1" goodproduct="500" defectproduct="5" time="2019-07-30 02:00:00" />
    <order index="2" goodproduct="980" defectproduct="7" time="2019-07-30 03:00:00" />
    <order index="3" goodproduct="1200" defectproduct="12" time="2019-07-30 04:00:00" />
    <order index="4" goodproduct="1800" defectproduct="15" time="2019-07-30 05:00:00" />
    <order index="5" goodproduct="2500" defectproduct="15" time="2019-07-30 06:00:00" />
  <shift>
  <shift code="2" timestart="2019-07-30 07:00:00">
    <order index="1" goodproduct="600" defectproduct="5" time="2019-07-30 07:00:00" />
    <order index="2" goodproduct="980" defectproduct="7" time="2019-07-30 08:00:00" />
    <order index="3" goodproduct="1500" defectproduct="8" time="2019-07-30 09:00:00" />
    <order index="4" goodproduct="1700" defectproduct="11" time="2019-07-30 10:00:00" />
    <order index="5" goodproduct="3000" defectproduct="15" time="2019-07-30 11:00:00" />
  </shift>
</myfile>

I can get values for the desired nodes. This is what I have done.

[xml]$inputFile = Get-Content "Q:\XML\FileComplex.xml"
$inputFile.myfile.product | Select-Object -Property type,planned,time,cycletime | ConvertTo-Csv -NoTypeInformation -Delimiter ";" | Set-Content -Path "Q:\XML\FileComplex.csv" -Encoding UTF8 "

What I'm trying to achieve is to combine all desired information together and to get it as one record of the CSV file, which is like this

updatetime          | code(process) | name(process) | code(equipment) | name(equipment) | type(product) | planned(product) | time(product) | cycletime(product) | goodproduct(shift(code) is 1 and index is max) | defectproduct(shift(code) is 1 and index is max) | goodproduct(shift(code) is 2 and index is max) | defectproduct((shift(code) is 2 where index is max)
2019-07-30 08:30:30 | PRS1234       | PROCESS1234   | EQP1234         | EQUIPMENT1234   | equipment     | 300              | 36000         | 20                 |                               2500             |                                               15 |                                           3000 |                                       15

I really appreciate your support!!

Thanks in advance Natasha

like image 561
Natasha Perera Avatar asked Jan 27 '23 00:01

Natasha Perera


2 Answers

I would solve this using SelectSingleNode() and XPath.

$data = New-Object xml;
$data.load(".\myfile.xml")

$record = [pscustomobject]@{
    "updatetime"             = $data.SelectSingleNode("/*/updatetime")."#text"
    "code(process)"          = $data.SelectSingleNode("/*/process").code
    "name(process)"          = $data.SelectSingleNode("/*/process").name
    "code(equipment)"        = $data.SelectSingleNode("/*/equipment").code
    "name(equipment)"        = $data.SelectSingleNode("/*/equipment").name
    "type(product)"          = $data.SelectSingleNode("/*/product").type
    "planned(product)"       = $data.SelectSingleNode("/*/product").planned
    "time(product)"          = $data.SelectSingleNode("/*/product").time
    "cycletime(product)"     = $data.SelectSingleNode("/*/product").cycletime
    "goodproduct(shift 1)"   = $data.SelectSingleNode("/*/shift[@code = 1]/order[not(@index < ../order/@index)]").goodproduct
    "defectproduct(shift 1)" = $data.SelectSingleNode("/*/shift[@code = 1]/order[not(@index < ../order/@index)]").defectproduct
    "goodproduct(shift 2)"   = $data.SelectSingleNode("/*/shift[@code = 2]/order[not(@index < ../order/@index)]").goodproduct
    "defectproduct(shift 2)" = $data.SelectSingleNode("/*/shift[@code = 2]/order[not(@index < ../order/@index)]").defectproduct
}

$record
#$record | ConvertTo-Csv -NoTypeInformation

Explanation of order[not(@index < ../order/@index)]: "Any <order> whose index is not less than the index of any other <order> next to it." - The only <order> for which this condition is true is the one with the max index.

Output is like this (before conversion to CSV)

updatetime             : 2019-07-30 08:30:30
code(process)          : PRS1234
name(process)          : PROCESS1234
code(equipment)        : EQP1234
name(equipment)        : EQUIPMENT1234
type(product)          : equipment
planned(product)       : 300
time(product)          : 36000
cycletime(product)     : 20
goodproduct(shift 1)   : 2500
defectproduct(shift 1) : 15
goodproduct(shift 2)   : 3000
defectproduct(shift 2) : 15
like image 71
Tomalak Avatar answered Feb 04 '23 13:02

Tomalak


OP's XML doesn't appear to be valid. Assuming the XML is meant to be...

<?xml version="1.0" encoding="UTF-8"?>
<myfile>
    <updatetime>2019-07-30 08:30:30</updatetime>
    <process code="PRS1234" name="PROCESS1234" />
    <equipment code="EQP1234" name="EQUIPMENT1234" />
    <product type="equipment" planned="300" time="36000" cycletime="20" />
    <shift code="1" timestart="2019-07-30 02:00:00">
        <order index="1" goodproduct="500" defectproduct="5" time="2019-07-30 02:00:00" />
        <order index="2" goodproduct="980" defectproduct="7" time="2019-07-30 03:00:00" />
        <order index="3" goodproduct="1200" defectproduct="12" time="2019-07-30 04:00:00" />
        <order index="4" goodproduct="1800" defectproduct="15" time="2019-07-30 05:00:00" />
        <order index="5" goodproduct="2500" defectproduct="15" time="2019-07-30 06:00:00" />
    </shift>
    <shift code="2" timestart="2019-07-30 07:00:00">
        <order index="1" goodproduct="600" defectproduct="5" time="2019-07-30 07:00:00" />
        <order index="2" goodproduct="980" defectproduct="7" time="2019-07-30 08:00:00" />
        <order index="3" goodproduct="1500" defectproduct="8" time="2019-07-30 09:00:00" />
        <order index="4" goodproduct="1700" defectproduct="11" time="2019-07-30 10:00:00" />
        <order index="5" goodproduct="3000" defectproduct="15" time="2019-07-30 11:00:00" />
    </shift>
</myfile>

Because PowerShell's XPath 1.0 doesn't support the max() function the following code assumes that the order elements are arranged in ascending index order and just selects the last one. If you can't guarantee the index order you'll need to come up with your own max() solution...

[xml]$xml = Get-Content "FileComplex.xml"

# PowerShell uses XPath 1.0, which doesn't support max().
# If it did, you could select: //shift[code=1]/order[@id = max(//shift[code=1]/order/@id)]
$order1 = $xml.SelectSingleNode("//shift[@code=1]/order[last()]")
$order2 = $xml.SelectSingleNode("//shift[@code=2]/order[last()]")

# Using Add-Member like this is messy, but guarantees the order of fields on the $result object...
$result = New-Object PSObject -Property $props
$result | Add-Member NoteProperty "updatetime" $xml.SelectSingleNode("//updatetime").InnerText
$result | Add-Member NoteProperty "code(process)" $xml.SelectSingleNode("//process").code
$result | Add-Member NoteProperty "name(process)" $xml.SelectSingleNode("//process").name
$result | Add-Member NoteProperty "code(equipment)" $xml.SelectSingleNode("//equipment").code
$result | Add-Member NoteProperty "name(equipment)" $xml.SelectSingleNode("//equipment").name
$result | Add-Member NoteProperty "type(product)" $xml.SelectSingleNode("//product").type
$result | Add-Member NoteProperty "planned(product)" $xml.SelectSingleNode("//product").planned
$result | Add-Member NoteProperty "time(product)" $xml.SelectSingleNode("//product").time
$result | Add-Member NoteProperty "cycletime(product)" $xml.SelectSingleNode("//product").cycletime
$result | Add-Member NoteProperty "goodproduct(shift(code) is 1 and index is max)" $order1.goodproduct
$result | Add-Member NoteProperty "defectproduct(shift(code) is 1 and index is max)" $order1.defectproduct
$result | Add-Member NoteProperty "goodproduct(shift(code) is 2 and index is max)" $order2.goodproduct
$result | Add-Member NoteProperty "defectproduct(shift(code) is 2 and index is max)" $order2.defectproduct

$result | Export-Csv -Path "FileComplex.csv" -Delimiter ';' -Encoding utf8
like image 40
AlwaysLearning Avatar answered Feb 04 '23 13:02

AlwaysLearning