
Merge multiple XML files into one using PowerShell 2.0?

I have a directory of very large XML files with a structure as this:

file1.xml:

<root>
 <EmployeeInfo attr="one" />
 <EmployeeInfo attr="two" />
 <EmployeeInfo attr="three" />
</root>

file2.xml:

<root>
 <EmployeeInfo attr="four" />
 <EmployeeInfo attr="five" />
 <EmployeeInfo attr="six" />
</root>

Now I am looking for a simple way to merge these files (*.xml) into one output file:

<root>
 <EmployeeInfo attr="one" />
 <EmployeeInfo attr="two" />
 <EmployeeInfo attr="three" />
 <EmployeeInfo attr="four" />
 <EmployeeInfo attr="five" />
 <EmployeeInfo attr="six" />
</root>

I was thinking about using pure XSLT such as this one:

<xsl:transform version="1.0" 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <Container>
      <xsl:copy-of select="document('file1.xml')"/>
      <xsl:copy-of select="document('file2.xml')"/>        
    </Container>
  </xsl:template>
</xsl:transform>

This works but isn't as flexible as I want. Being a novice with PowerShell (version 2) and eager to learn best practices for working with XML in PowerShell, I am wondering: what is the simplest, purest PowerShell way of merging the structure of several XML documents into one?

Cheers, Joakim

Yooakim asked Jun 04 '10 07:06


2 Answers

While the XSLT way to do this is pretty short, so is the PowerShell way:

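# Collect the input files; skip the output file in case the script is re-run
$files = Get-ChildItem *.xml | Where-Object { $_.Name -ne 'final.xml' }
$finalXml = "<root>"
foreach ($file in $files) {
    [xml]$xml = Get-Content $file.FullName
    # Append only the children of each root element, not the root itself
    $finalXml += $xml.DocumentElement.InnerXml
}
$finalXml += "</root>"
([xml]$finalXml).Save("$pwd\final.xml")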

Hope this helps,

Start-Automating answered Nov 29 '22 05:11


Personally I would not use PowerShell for such a task.

Typically you use PowerShell to access config files like this:

$config = [xml](gc web.config)

Then you can work with the XML as if it were a set of objects, which is pretty cool. However, if you need to process large XML structures, using [xml] (which is equivalent to XmlDocument) is quite memory-expensive, because the whole document is loaded into memory.
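To illustrate the object-style access, here is a minimal sketch. It uses an inline sample shaped like the files in the question rather than a real web.config, and the variable names are only illustrative.

```powershell
# Inline sample mirroring the question's structure; no file is needed
[xml]$doc = '<root><EmployeeInfo attr="one" /><EmployeeInfo attr="two" /></root>'

# Child elements and attributes surface as ordinary properties
$attrs = $doc.root.EmployeeInfo | ForEach-Object { $_.attr }
$attrs    # emits "one" and then "two"

# The underlying System.Xml.XmlDocument API is still available
$count = $doc.SelectNodes('/root/EmployeeInfo').Count
$count    # 2
```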

However, that is almost all the XML support PowerShell has built in (get-command *xml* -CommandType cmdlet will list all the XML-related cmdlets).
It is of course possible to use .NET classes for XML operations, but that code won't be as pretty as a true PowerShell approach. For your task you would need to use readers/writers, which is imho not worth doing.
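For reference, the reader/writer route could look roughly like the sketch below. It is only a sketch under some assumptions: Merge-XmlStreaming is a made-up helper name, every input is assumed to have a single root element whose direct child elements should be copied, and anything else directly under the root (text, comments) is dropped. The paths should be full paths, since .NET does not follow PowerShell's current location.

```powershell
# Sketch: stream the child elements of each root into one output file.
# Merge-XmlStreaming is a hypothetical helper, not a built-in cmdlet.
function Merge-XmlStreaming {
    param(
        [string[]]$InputFiles,   # full paths of the files to merge
        [string]$OutputPath      # full path of the merged file
    )

    $settings = New-Object System.Xml.XmlWriterSettings
    $settings.Indent = $true
    $writer = [System.Xml.XmlWriter]::Create($OutputPath, $settings)
    try {
        $writer.WriteStartElement('root')
        foreach ($path in $InputFiles) {
            $reader = [System.Xml.XmlReader]::Create($path)
            try {
                # Move to the document element, then step to the node after its start tag
                [void]$reader.MoveToContent()
                [void]$reader.Read()
                while (-not $reader.EOF) {
                    if ($reader.NodeType -eq [System.Xml.XmlNodeType]::Element -and $reader.Depth -eq 1) {
                        # WriteNode copies the element and advances the reader past it
                        $writer.WriteNode($reader, $false)
                    } else {
                        # Skip whitespace, comments and the closing root tag
                        [void]$reader.Read()
                    }
                }
            } finally {
                $reader.Close()
            }
        }
        $writer.WriteEndElement()
    } finally {
        $writer.Close()
    }
}
```

A call might look like Merge-XmlStreaming -InputFiles (Get-ChildItem *.xml | ForEach-Object { $_.FullName }) -OutputPath "$pwd\final.xml". Only one element at a time is held in memory, which is the point of using readers/writers for very large files.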

That's why I think XSLT is the better approach ;) If you need to be flexible, you can generate the XSLT template during script execution or just replace the file names; that's no problem.
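Generating the template at run time could be sketched as follows. This is an assumption-laden sketch rather than a tested recipe: Merge-XmlViaGeneratedXslt is a made-up name, the inputs are assumed to share a root element called root, and the full paths are assumed to contain no apostrophes (which would break the XPath string literal). It relies on the .NET XslCompiledTransform class with the document() function explicitly enabled.

```powershell
# Sketch: build the stylesheet text in the script, then run it via .NET.
# Merge-XmlViaGeneratedXslt is a hypothetical helper, not a built-in cmdlet.
function Merge-XmlViaGeneratedXslt {
    param(
        [string[]]$InputFiles,   # full paths of the files to merge
        [string]$OutputPath      # full path of the merged file
    )

    # One xsl:copy-of per input file, addressed by an absolute file URI
    $copies = $InputFiles | ForEach-Object {
        $uri = (New-Object System.Uri($_)).AbsoluteUri
        "      <xsl:copy-of select=""document('$uri')/root/*""/>"
    }

    $xslt = @"
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/">
    <root>
$($copies -join "`n")
    </root>
  </xsl:template>
</xsl:stylesheet>
"@

    $settings  = New-Object System.Xml.Xsl.XsltSettings($true, $false)   # allow document(), no script
    $resolver  = New-Object System.Xml.XmlUrlResolver
    $transform = New-Object System.Xml.Xsl.XslCompiledTransform
    $sheet     = [System.Xml.XmlReader]::Create((New-Object System.IO.StringReader($xslt)))
    $transform.Load($sheet, $settings, $resolver)

    # The stylesheet ignores its input document, so a dummy one is enough
    $reader = [System.Xml.XmlReader]::Create((New-Object System.IO.StringReader('<dummy/>')))
    $writer = [System.Xml.XmlWriter]::Create($OutputPath, $transform.OutputSettings)
    try {
        $transform.Transform($reader, $null, $writer, $resolver)
    } finally {
        $writer.Close()
        $reader.Close()
    }
}
```

Note that document() still loads each input file in full, so this keeps the script short but does not reduce memory use for very large files.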

stej answered Nov 29 '22 05:11