
Converting xml from UTF-16 to UTF-8 using PowerShell

What's the easiest way to convert XML from UTF16 to a UTF8 encoded file?

David Gardiner asked Apr 15 '09 05:04


2 Answers

This may not be optimal, but it works: load the XML and write it back out to a file. Only the root element is written back, so the XML declaration is lost and has to be re-added.

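# Convert every XML file in the current directory to UTF-8.
$files = Get-ChildItem "*.xml"
foreach ( $file in $files )
{
    $doc = New-Object System.Xml.XmlDocument
    $doc.PreserveWhitespace = $true
    # Pass the full path: .NET resolves relative paths against the process
    # working directory, which can differ from the PowerShell location.
    $doc.Load( $file.FullName )

    # OuterXml of the root element drops the declaration, so add a UTF-8 one back.
    $xml = $doc.DocumentElement.OuterXml
    $xml = '<?xml version="1.0" encoding="utf-8"?>' + $xml

    # Write the result to <name>.new in the current location.
    $newFile = $file.Name + ".new"
    Set-Content -Encoding UTF8 $newFile $xml
}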
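
As a variation on the loop above, the declaration can be edited in place instead of being stripped and concatenated back. The following is only a sketch: `Convert-XmlToUtf8` and its `-Path` and `-Destination` parameters are hypothetical names chosen for illustration, and it relies on `XmlDocument.Save(string)` taking the output encoding from the declaration's `Encoding` property.

```powershell
# Hypothetical helper (not from the answer above): re-save one XML file as UTF-8.
function Convert-XmlToUtf8 {
    param(
        [Parameter(Mandatory = $true)] [string] $Path,
        [Parameter(Mandatory = $true)] [string] $Destination
    )

    # Resolve both paths up front, because .NET does not follow the PowerShell location.
    $sourcePath = (Resolve-Path -LiteralPath $Path).ProviderPath
    $targetPath = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($Destination)

    $doc = New-Object System.Xml.XmlDocument
    $doc.PreserveWhitespace = $true
    # Load detects the source encoding from the byte order mark or the declaration.
    $doc.Load($sourcePath)

    if ($doc.FirstChild -is [System.Xml.XmlDeclaration]) {
        # Point the existing declaration at the new encoding.
        $doc.FirstChild.Encoding = 'utf-8'
    }
    else {
        # No declaration present: add one so the output states its encoding.
        $declaration = $doc.CreateXmlDeclaration('1.0', 'utf-8', $null)
        $doc.InsertBefore($declaration, $doc.FirstChild) | Out-Null
    }

    # Save(string) encodes the file according to the declaration set above.
    $doc.Save($targetPath)
}
```

It could be called as `Convert-XmlToUtf8 -Path .\input.xml -Destination .\input.xml.new`, where both file names are placeholders. Unlike the root-element approach, this keeps comments and processing instructions that appear before the root element.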
Ben Laan answered Sep 22 '22 07:09


Well, I guess the easiest way is to just not care about whether the file is XML or not and simply convert:

Get-Content file.foo -Encoding Unicode | Set-Content -Encoding UTF8 newfile.foo

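This only produces valid XML when the file has no declaration line such as

<?xml version="1.0" encoding="UTF-16"?>

because the declaration is copied through unchanged and would still claim UTF-16 after the bytes have become UTF-8.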
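
If the file does carry such a declaration, the same text-level conversion can patch it on the way through. This is a rough sketch, not a general XML rewriter: `Convert-Utf16TextToUtf8` is a hypothetical helper name, the regular expression only handles a declaration at the very start of the file, and `-Raw` and `-NoNewline` assume PowerShell 5 or later.

```powershell
# Hypothetical helper: re-encode a UTF-16 text file as UTF-8 and fix the XML declaration.
function Convert-Utf16TextToUtf8 {
    param(
        [Parameter(Mandatory = $true)] [string] $Path,
        [Parameter(Mandatory = $true)] [string] $Destination
    )

    # Read the whole file as a single string, decoding it as UTF-16 ("Unicode").
    $text = Get-Content -LiteralPath $Path -Encoding Unicode -Raw

    # Swap the encoding name inside a leading declaration; -replace ignores case.
    $pattern = '^(<\?xml[^>]*?encoding=["''])UTF-16(["''])'
    $text = $text -replace $pattern, '${1}utf-8${2}'

    # Write the text back as UTF-8 without appending an extra newline.
    Set-Content -LiteralPath $Destination -Encoding UTF8 -Value $text -NoNewline
}
```

A call might look like `Convert-Utf16TextToUtf8 -Path file.foo -Destination newfile.foo`. Whether `Set-Content -Encoding UTF8` writes a byte order mark depends on the PowerShell version: Windows PowerShell does, PowerShell 6 and later do not.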

Joey answered Sep 22 '22 07:09