How can I use XmlReader in PowerShell to stream big/huge XML files?

Tags:

powershell

xml

I have an XML file that is a couple of gigabytes in size. There are no spaces in the XML.

So I wrote a little C# code to split it into single files (it also has some additional code to do things such as randomizing while testing):

using (XmlReader MyReader = XmlReader.Create(@"d:\xml\test.xml"))
{
    while (MyReader.Read())
    {
        switch (MyReader.NodeType)
        {
            case XmlNodeType.Element:
                if (MyReader.Name == "Customer")
                {
                    XElement el = XElement.ReadFrom(MyReader) as XElement;
                    if (el != null)
                    {
                        custNumber = (string)el.Element("CustNumber");
                        output = @"d:\xml\output\" + custNumber;

                        File.WriteAllText(output, el.ToString());
                    }
                }
                break;
        }
    }
}

I then parse the resulting files with PowerShell, basically because I find it easier to work with on the server: the specs can change, and I can change the script on the fly.

So... what is the easiest way to convert the above to PowerShell? Is it just a matter of putting [.Net here] before everything? And would I have to read byte by byte, just in case "<cust" is on one line and "omer>" on the next?

edelwater asked Nov 08 '14 18:11



1 Answer

This should be pretty close to what you wanted to do in PowerShell:

Add-Type -AssemblyName System.Xml.Linq # Loads XElement; needed in Windows PowerShell

$f = [System.Xml.XmlReader]::Create("d:\xml\test.xml")

while ($f.Read())
{
    switch ($f.NodeType)
    {
        ([System.Xml.XmlNodeType]::Element) # Make sure to put this between parentheses
        {
            if ($f.Name -eq "Customer")
            {
                $e = [System.Xml.Linq.XElement]::ReadFrom($f)

                if ($null -ne $e)
                {
                    # Text content of the <CustNumber> child, used as the file name
                    $custNumber = $e.Element("CustNumber").Value

                    $e.ToString() | Out-File -Append -FilePath ("d:\xml\output\" + $custNumber)
                }
            }
            break
        }
    }
}

$f.Dispose() # Release the file handle
Micky Balladelli answered Oct 06 '22 21:10
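
One caveat that may be worth checking with the loop above: XNode.ReadFrom consumes the whole element and leaves the reader positioned on the node that follows it. If the file really has no whitespace between elements, that next node is the next <Customer>, and the following Read() call would step past it, so every second customer could be skipped. The sketch below is one possible way around that, not the answer's original method: it only calls Read() when ReadFrom was not used. The temp directory, the sample XML and the ".xml" file suffix are assumptions made up for the demo.

```powershell
Add-Type -AssemblyName System.Xml.Linq   # XNode/XElement; needed in Windows PowerShell

# Hypothetical demo input: three <Customer> elements with no whitespace between them.
$workDir = Join-Path ([System.IO.Path]::GetTempPath()) ("xmlsplit-" + [guid]::NewGuid())
$outDir  = Join-Path $workDir 'output'
New-Item -ItemType Directory -Path $outDir -Force | Out-Null
$inFile  = Join-Path $workDir 'test.xml'
$sample  = '<Customers>' +
           '<Customer><CustNumber>1</CustNumber></Customer>' +
           '<Customer><CustNumber>2</CustNumber></Customer>' +
           '<Customer><CustNumber>3</CustNumber></Customer>' +
           '</Customers>'
[System.IO.File]::WriteAllText($inFile, $sample)

$reader = [System.Xml.XmlReader]::Create($inFile)
try {
    $more = $reader.Read()
    while ($more) {
        $isCustomer = $reader.NodeType -eq [System.Xml.XmlNodeType]::Element -and
                      $reader.Name -eq 'Customer'
        if ($isCustomer) {
            # ReadFrom loads only this one element into memory and
            # leaves the reader on the node right after </Customer>.
            $e = [System.Xml.Linq.XNode]::ReadFrom($reader)
            $custNumber = $e.Element('CustNumber').Value
            $target = Join-Path $outDir "$custNumber.xml"
            [System.IO.File]::WriteAllText($target, $e.ToString())
            # No Read() here: the reader has already moved on.
            $more = -not $reader.EOF
        }
        else {
            $more = $reader.Read()
        }
    }
}
finally {
    $reader.Dispose()   # release the file handle even if something throws
}
```

On the byte-by-byte worry in the question: XmlReader tokenizes the stream itself, so an element name is reported whole regardless of where line breaks fall in the file.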