PowerShell saving XML and preserving format

Tags:

powershell

I want to read an XML file, modify an element, and save it back to the same file. What is the best way to do this while preserving the formatting and keeping the original line terminator (CRLF vs. LF)?

Here is what I have but it doesn't do that:

$xml = [xml]([System.IO.File]::ReadAllText($fileName))
$xml.PreserveWhitespace = $true
# Change some element
$xml.Save($fileName)

The problem is that extra newlines (that is, empty lines in the XML) are removed, and afterwards the file contains a mix of LF and CRLF line endings.

Matthew M. Osborn asked Nov 17 '11 00:11


4 Answers

You can use the PowerShell [xml] object and set $xml.PreserveWhitespace = $true, or do the same thing using .NET XmlDocument:

$f = '.\xml_test.xml'

# Using .NET XmlDocument
$xml = New-Object System.Xml.XmlDocument
$xml.PreserveWhitespace = $true

# Or using PS [xml] (older PowerShell versions may need to use psbase)
$xml = New-Object xml
#$xml.psbase.PreserveWhitespace = $true  # Older PS versions
$xml.PreserveWhitespace = $true

# Load with preserve setting
$xml.Load($f)
$n = $xml.SelectSingleNode('//file')
$n.InnerText = 'b'
$xml.Save($f)

Just make sure to set PreserveWhitespace before calling XmlDocument.Load or XmlDocument.LoadXml.

NOTE: This does not preserve white space between XML attributes! White space inside attribute values seems to be preserved, but white space between attributes is not. The documentation talks about preserving "white space nodes" (node.NodeType = System.Xml.XmlNodeType.Whitespace), not attributes.
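
To see why the order matters, here is a minimal sketch that parses the same text twice, once with PreserveWhitespace set before loading and once with the default. The sample XML string and the variable names are made up for illustration; only XmlDocument, PreserveWhitespace and LoadXml come from the answer above.

```powershell
# Hypothetical sample document with an empty line between the two elements
$text = "<root>`r`n  <file>a</file>`r`n`r`n  <other>x</other>`r`n</root>"

# PreserveWhitespace set BEFORE LoadXml: whitespace nodes are kept in the DOM
$kept = New-Object System.Xml.XmlDocument
$kept.PreserveWhitespace = $true
$kept.LoadXml($text)

# PreserveWhitespace left at its default ($false): whitespace nodes are dropped while parsing
$dropped = New-Object System.Xml.XmlDocument
$dropped.LoadXml($text)

# Count the direct children of <root> in each document
$keptCount    = $kept.DocumentElement.ChildNodes.Count      # elements plus whitespace nodes
$droppedCount = $dropped.DocumentElement.ChildNodes.Count   # elements only

"With PreserveWhitespace: $keptCount child nodes; without: $droppedCount"
```

Once the whitespace nodes have been dropped by the parser, setting PreserveWhitespace afterwards cannot bring them back, which is consistent with what the question observed.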

Ryan answered Nov 03 '22 08:11


If you want to correct the CRLF sequences that are converted to LF in text nodes when you call the Save method on the XmlDocument, you can use an XmlWriterSettings instance. This uses the same XmlWriter as MilesDavies192's answer, but also sets the encoding to UTF-8 and keeps indentation.

$xml = [xml]([System.IO.File]::ReadAllText($fileName))
$xml.PreserveWhitespace = $true

# Change some element

#Settings object will instruct how the xml elements are written to the file
$settings = New-Object System.Xml.XmlWriterSettings
$settings.Indent = $true
#NewLineChars will affect all newlines
$settings.NewLineChars ="`r`n"
#Set an optional encoding, UTF-8 is the most used (without BOM)
$settings.Encoding = New-Object System.Text.UTF8Encoding( $false )

$w = [System.Xml.XmlWriter]::Create($fileName, $settings)
try{
    $xml.Save( $w )
} finally{
    $w.Dispose()
}
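
The example above always writes CRLF. If the goal is to match whatever terminator the file already used, one possible approach is to inspect the original text first. This is only a sketch: Get-NewLineStyle is a hypothetical helper, the temp-file path and sample content are assumptions for the demo, and a file with mixed endings is simply treated as CRLF.

```powershell
# Hypothetical helper: returns CRLF if the text contains any CRLF, otherwise LF
function Get-NewLineStyle([string]$text) {
    if ($text.Contains("`r`n")) { return "`r`n" }
    return "`n"
}

# Assumed demo file with LF line endings (absolute path, so .NET and PowerShell agree on it)
$fileName = Join-Path ([System.IO.Path]::GetTempPath()) 'newline_demo.xml'
[System.IO.File]::WriteAllText($fileName, "<root>`n  <file>a</file>`n</root>")

$original = [System.IO.File]::ReadAllText($fileName)

$xml = New-Object System.Xml.XmlDocument
$xml.LoadXml($original)
$xml.SelectSingleNode('//file').InnerText = 'b'

$settings = New-Object System.Xml.XmlWriterSettings
$settings.Indent = $true
# Match the terminator the file already used instead of hard-coding CRLF
$settings.NewLineChars = Get-NewLineStyle $original
$settings.Encoding = New-Object System.Text.UTF8Encoding($false)

$w = [System.Xml.XmlWriter]::Create($fileName, $settings)
try { $xml.Save($w) } finally { $w.Dispose() }

$result = [System.IO.File]::ReadAllText($fileName)
```

Because the document is loaded without PreserveWhitespace here, the writer re-indents it, so this keeps the line terminator but not the original blank lines.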

Dan answered Nov 03 '22 09:11


When XML is read, empty lines are ignored by default. To preserve them, set the PreserveWhitespace property before reading the file:

Create an XmlDocument object and configure PreserveWhitespace:

$xmlDoc = [xml]::new()
$xmlDoc.PreserveWhitespace = $true

Load the document:

$xmlDoc.Load($myFilePath)

or

$xmlDoc.LoadXml($(Get-Content $myFilePath -Raw))

arielhad answered Nov 03 '22 08:11


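If you save using an XmlWriter, line endings are replaced according to the writer's settings (NewLineHandling defaults to Replace, and NewLineChars is CR/LF by default on Windows). Indentation is controlled by the Indent and IndentChars settings; IndentChars defaults to two spaces, but Indent is off unless you enable it. The settings of a writer returned by XmlWriter.Create cannot be changed afterwards, so create the writer with an XmlWriterSettings object configured for your needs.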

    $fileXML = New-Object System.Xml.XmlDocument

    # Try to read the file as XML. Let the errors propagate if it's not.
    [void]$fileXML.Load($file)

    $writerXML = [System.Xml.XmlWriter]::Create($file)
    try {
        $fileXML.Save($writerXML)
    } finally {
        # Dispose flushes the writer and releases the file handle
        $writerXML.Dispose()
    }

MilesDavies192 answered Nov 03 '22 08:11