UTF-8 output from PowerShell

I'm trying to use Process.Start with redirected I/O to call PowerShell.exe with a string, and to get the output back, all in UTF-8. But I don't seem to be able to make this work.

What I've tried:

  • Passing the command to run via the -Command parameter
  • Writing the PowerShell script as a file to disk with UTF-8 encoding
  • Writing the PowerShell script as a file to disk with UTF-8 with BOM encoding
  • Writing the PowerShell script as a file to disk with UTF-16
  • Setting Console.OutputEncoding in both my console application and in the PowerShell script
  • Setting $OutputEncoding in PowerShell
  • Setting Process.StartInfo.StandardOutputEncoding
  • Doing it all with Encoding.Unicode instead of Encoding.UTF8

In every case, when I inspect the bytes I'm given, I get different values to my original string. I'd really love an explanation as to why this doesn't work.

Here is my code:

static void Main(string[] args)
{
    DumpBytes("Héllo");

    ExecuteCommand("PowerShell.exe", "-Command \"$OutputEncoding = [System.Text.Encoding]::UTF8 ; Write-Output 'Héllo';\"",
        Environment.CurrentDirectory, DumpBytes, DumpBytes);

    Console.ReadLine();
}

static void DumpBytes(string text)
{
    Console.Write(text + " " + string.Join(",", Encoding.UTF8.GetBytes(text).Select(b => b.ToString("X"))));
    Console.WriteLine();
}

static int ExecuteCommand(string executable, string arguments, string workingDirectory, Action<string> output, Action<string> error)
{
    try
    {
        using (var process = new Process())
        {
            process.StartInfo.FileName = executable;
            process.StartInfo.Arguments = arguments;
            process.StartInfo.WorkingDirectory = workingDirectory;
            process.StartInfo.UseShellExecute = false;
            process.StartInfo.CreateNoWindow = true;
            process.StartInfo.RedirectStandardOutput = true;
            process.StartInfo.RedirectStandardError = true;
            process.StartInfo.StandardOutputEncoding = Encoding.UTF8;
            process.StartInfo.StandardErrorEncoding = Encoding.UTF8;

            using (var outputWaitHandle = new AutoResetEvent(false))
            using (var errorWaitHandle = new AutoResetEvent(false))
            {
                process.OutputDataReceived += (sender, e) =>
                {
                    if (e.Data == null)
                    {
                        outputWaitHandle.Set();
                    }
                    else
                    {
                        output(e.Data);
                    }
                };

                process.ErrorDataReceived += (sender, e) =>
                {
                    if (e.Data == null)
                    {
                        errorWaitHandle.Set();
                    }
                    else
                    {
                        error(e.Data);
                    }
                };

                process.Start();

                process.BeginOutputReadLine();
                process.BeginErrorReadLine();

                process.WaitForExit();
                outputWaitHandle.WaitOne();
                errorWaitHandle.WaitOne();

                return process.ExitCode;
            }
        }
    }
    catch (Exception ex)
    {
        throw new Exception(string.Format("Error when attempting to execute {0}: {1}", executable, ex.Message),
            ex);
    }
}

Update 1

I found that if I make this script:

[Console]::OutputEncoding = [System.Text.Encoding]::UTF8
Write-Host "Héllo!"
[Console]::WriteLine("Héllo")

Then invoke it via:

ExecuteCommand("PowerShell.exe", "-File C:\\Users\\Paul\\Desktop\\Foo.ps1",
    Environment.CurrentDirectory, DumpBytes, DumpBytes);

The first line is corrupted, but the second isn't:

H?llo! 48,EF,BF,BD,6C,6C,6F,21
Héllo 48,C3,A9,6C,6C,6F

This suggests to me that my redirection code is all working fine; when I use Console.WriteLine in PowerShell I get UTF-8 as I expect.

This means that PowerShell's Write-Output and Write-Host commands must be doing something different with the output, and not simply calling Console.WriteLine.
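One way to read the bytes in the corrupted line, offered as a sketch and not a confirmed diagnosis: EF,BF,BD is the UTF-8 encoding of U+FFFD, the replacement character that a UTF-8 decoder substitutes for bytes it cannot decode. The snippet below reproduces the same dump by feeding a single-byte 'é' through a UTF-8 decode. The class name, the helper names, and the choice of 0x82 ('é' in OEM code pages 437 and 850) are assumptions for illustration; the question does not show which raw bytes PowerShell actually wrote.

```csharp
using System;
using System.Linq;
using System.Text;

static class ReplacementCharDemo
{
    // Hex dump in the same comma-separated format as DumpBytes in the question.
    public static string Hex(byte[] bytes)
    {
        return string.Join(",", bytes.Select(b => b.ToString("X")));
    }

    // Decode raw bytes as UTF-8 (as StandardOutputEncoding = UTF8 would),
    // then re-encode the resulting string as UTF-8 and dump it.
    public static string RoundTripThroughUtf8(byte[] raw)
    {
        string decoded = Encoding.UTF8.GetString(raw);
        return Hex(Encoding.UTF8.GetBytes(decoded));
    }

    static void Main()
    {
        // Assumption: "Héllo!" written with 'é' as the single byte 0x82.
        // A lone 0x82 is not valid UTF-8, so it decodes to U+FFFD.
        byte[] singleByte = { 0x48, 0x82, 0x6C, 0x6C, 0x6F, 0x21 };
        Console.WriteLine(RoundTripThroughUtf8(singleByte)); // 48,EF,BF,BD,6C,6C,6F,21

        // "Héllo" already encoded as UTF-8 survives the round trip unchanged.
        byte[] utf8 = { 0x48, 0xC3, 0xA9, 0x6C, 0x6C, 0x6F };
        Console.WriteLine(RoundTripThroughUtf8(utf8)); // 48,C3,A9,6C,6C,6F
    }
}
```

If this reading is right, the "?" appears on the receiving side: the child process wrote a non-UTF-8 byte and the parent's UTF-8 decoder replaced it before DumpBytes ever ran.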

Update 2

I've even tried the following to force the PowerShell console code page to UTF-8, but Write-Host and Write-Output continue to produce broken results while [Console]::WriteLine works.

$sig = @'
[DllImport("kernel32.dll")]
public static extern bool SetConsoleCP(uint wCodePageID);

[DllImport("kernel32.dll")]
public static extern bool SetConsoleOutputCP(uint wCodePageID);
'@

$type = Add-Type -MemberDefinition $sig -Name Win32Utils -Namespace Foo -PassThru

$type::SetConsoleCP(65001)
$type::SetConsoleOutputCP(65001)

Write-Host "Héllo!"

& chcp    # Tells us 65001 (UTF-8) is being used
Paul Stovell asked Mar 12 '14 10:03





1 Answer

Not an expert on encoding, but after reading these...

  • http://blogs.msdn.com/b/powershell/archive/2006/12/11/outputencoding-to-the-rescue.aspx
  • http://technet.microsoft.com/en-us/library/hh847796.aspx
  • http://www.johndcook.com/blog/2008/08/25/powershell-output-redirection-unicode-or-ascii/

... it seems fairly clear that the $OutputEncoding variable only affects data piped to native applications.

When writing to a file from within PowerShell, the encoding can be controlled with the -Encoding parameter of the Out-File cmdlet, e.g.

 write-output "hello" | out-file "enctest.txt" -encoding utf8 

As far as I can tell, there is nothing more you can do on the PowerShell side, but the following post may well help you:

  • http://blogs.msdn.com/b/ddietric/archive/2010/11/08/decoding-standard-output-and-standard-error-when-redirecting-to-a-gui-application.aspx
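On the C# side, one option is to take the raw bytes from the child process and choose the decoding yourself. The following is a minimal sketch under stated assumptions and is not taken from the linked post: the class and method names are hypothetical, the bytes are assumed to have been captured already (for example by copying process.StandardOutput.BaseStream into a MemoryStream instead of using BeginOutputReadLine), and ISO-8859-1 stands in for whatever code page the child actually used.

```csharp
using System;
using System.Text;

static class OutputDecoder
{
    // Try strict UTF-8 first; if the bytes are not valid UTF-8,
    // decode them with the caller-supplied fallback encoding instead.
    public static string Decode(byte[] raw, Encoding fallback)
    {
        // Second argument (throwOnInvalidBytes: true) makes malformed
        // input throw instead of silently producing U+FFFD.
        var strictUtf8 = new UTF8Encoding(false, true);
        try
        {
            return strictUtf8.GetString(raw);
        }
        catch (DecoderFallbackException)
        {
            return fallback.GetString(raw);
        }
    }

    static void Main()
    {
        // Assumption: ISO-8859-1 is only a placeholder fallback here.
        Encoding fallback = Encoding.GetEncoding("iso-8859-1");

        byte[] utf8Bytes = { 0x48, 0xC3, 0xA9, 0x6C, 0x6C, 0x6F };   // "Héllo" as UTF-8
        byte[] latin1Bytes = { 0x48, 0xE9, 0x6C, 0x6C, 0x6F };       // "Héllo" as ISO-8859-1

        Console.WriteLine(Decode(utf8Bytes, fallback) == "H\u00E9llo");   // True
        Console.WriteLine(Decode(latin1Bytes, fallback) == "H\u00E9llo"); // True
    }
}
```

The fallback is passed in as a parameter because the right code page depends on the machine; the sketch does not try to detect it.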
andyb answered Sep 18 '22 18:09
