I'm trying to use Process.Start with redirected I/O to call PowerShell.exe with a string, and to get the output back, all in UTF-8. But I don't seem to be able to make this work.
What I've tried:
- Passing the script via the -Command parameter
- Setting Console.OutputEncoding in both my console application and in the PowerShell script
- Setting $OutputEncoding in PowerShell
- Setting Process.StartInfo.StandardOutputEncoding
- Using Encoding.Unicode instead of Encoding.UTF8
In every case, when I inspect the bytes I'm given, I get different values to my original string. I'd really love an explanation as to why this doesn't work.
Here is my code:
static void Main(string[] args)
{
    DumpBytes("Héllo");

    ExecuteCommand("PowerShell.exe", "-Command \"$OutputEncoding = [System.Text.Encoding]::UTF8 ; Write-Output 'Héllo';\"", Environment.CurrentDirectory, DumpBytes, DumpBytes);

    Console.ReadLine();
}

static void DumpBytes(string text)
{
    Console.Write(text + " " + string.Join(",", Encoding.UTF8.GetBytes(text).Select(b => b.ToString("X"))));
    Console.WriteLine();
}

static int ExecuteCommand(string executable, string arguments, string workingDirectory, Action<string> output, Action<string> error)
{
    try
    {
        using (var process = new Process())
        {
            process.StartInfo.FileName = executable;
            process.StartInfo.Arguments = arguments;
            process.StartInfo.WorkingDirectory = workingDirectory;
            process.StartInfo.UseShellExecute = false;
            process.StartInfo.CreateNoWindow = true;
            process.StartInfo.RedirectStandardOutput = true;
            process.StartInfo.RedirectStandardError = true;
            process.StartInfo.StandardOutputEncoding = Encoding.UTF8;
            process.StartInfo.StandardErrorEncoding = Encoding.UTF8;

            using (var outputWaitHandle = new AutoResetEvent(false))
            using (var errorWaitHandle = new AutoResetEvent(false))
            {
                process.OutputDataReceived += (sender, e) =>
                {
                    if (e.Data == null)
                    {
                        outputWaitHandle.Set();
                    }
                    else
                    {
                        output(e.Data);
                    }
                };

                process.ErrorDataReceived += (sender, e) =>
                {
                    if (e.Data == null)
                    {
                        errorWaitHandle.Set();
                    }
                    else
                    {
                        error(e.Data);
                    }
                };

                process.Start();

                process.BeginOutputReadLine();
                process.BeginErrorReadLine();

                process.WaitForExit();
                outputWaitHandle.WaitOne();
                errorWaitHandle.WaitOne();

                return process.ExitCode;
            }
        }
    }
    catch (Exception ex)
    {
        throw new Exception(string.Format("Error when attempting to execute {0}: {1}", executable, ex.Message), ex);
    }
}
I found that if I make this script:
[Console]::OutputEncoding = [System.Text.Encoding]::UTF8
Write-Host "Héllo!"
[Console]::WriteLine("Héllo")
Then invoke it via:
ExecuteCommand("PowerShell.exe", "-File C:\\Users\\Paul\\Desktop\\Foo.ps1", Environment.CurrentDirectory, DumpBytes, DumpBytes);
The first line is corrupted, but the second isn't:
H?llo! 48,EF,BF,BD,6C,6C,6F,21
Héllo 48,C3,A9,6C,6C,6F
This suggests to me that my redirection code is all working fine; when I use Console.WriteLine in PowerShell I get UTF-8 as I expect.
This means that PowerShell's Write-Output and Write-Host commands must be doing something different with the output, and not simply calling Console.WriteLine.
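For reference, the EF,BF,BD in the corrupted line is the UTF-8 encoding of U+FFFD, the replacement character a UTF-8 decoder emits when it meets a byte that is not valid UTF-8. The sketch below reproduces the same byte dump under one assumption that the output above does not confirm: that the child process wrote "Héllo!" in an OEM code page such as 437 or 850, where é is the single byte 0x82. The class and method names are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text;

static class ReplacementCharDemo
{
    // Same hex format as DumpBytes in the question.
    public static string Hex(string text) =>
        string.Join(",", Encoding.UTF8.GetBytes(text).Select(b => b.ToString("X")));

    // Decodes raw bytes as UTF-8, which is what the redirected reader does
    // when StandardOutputEncoding is set to Encoding.UTF8.
    public static string DecodeAsUtf8(byte[] raw) => Encoding.UTF8.GetString(raw);

    static void Main()
    {
        // ASSUMPTION: "Héllo!" as written in code page 437/850 (é = 0x82).
        byte[] oemBytes = { 0x48, 0x82, 0x6C, 0x6C, 0x6F, 0x21 };
        Console.WriteLine(Hex(DecodeAsUtf8(oemBytes)));   // 48,EF,BF,BD,6C,6C,6F,21

        // The same text written as real UTF-8 survives the round trip.
        byte[] utf8Bytes = Encoding.UTF8.GetBytes("H\u00E9llo");
        Console.WriteLine(Hex(DecodeAsUtf8(utf8Bytes)));  // 48,C3,A9,6C,6C,6F
    }
}
```

If that assumption holds, the corruption happens on the writing side: the reader is decoding correctly, but the bytes it receives for Write-Host were never UTF-8 in the first place.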
I've even tried the following to force the PowerShell console code page to UTF-8, but Write-Host and Write-Output continue to produce broken results while [Console]::WriteLine works.
$sig = @'
[DllImport("kernel32.dll")]
public static extern bool SetConsoleCP(uint wCodePageID);

[DllImport("kernel32.dll")]
public static extern bool SetConsoleOutputCP(uint wCodePageID);
'@

$type = Add-Type -MemberDefinition $sig -Name Win32Utils -Namespace Foo -PassThru

$type::SetConsoleCP(65001)
$type::SetConsoleOutputCP(65001)

Write-Host "Héllo!"

& chcp # Tells us 65001 (UTF-8) is being used
PowerShell uses a Unicode character set by default. However, several cmdlets have an Encoding parameter that can specify encoding for a different character set. This parameter allows you to choose the specific character encoding you need for interoperability with other systems and applications.
Not an expert on encoding, but after reading these...
... it seems fairly clear that the $OutputEncoding variable only affects data piped to native applications.
If sending to a file from within PowerShell, the encoding can be controlled by the -Encoding parameter on the Out-File cmdlet, e.g.
write-output "hello" | out-file "enctest.txt" -encoding utf8
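On the calling side, that file could then be read back instead of stdout. The following is only a sketch of that idea, not a tested fix for the original problem: RunToFile and ReadUtf8File are hypothetical helper names, the temp-file handling is an illustration, and the text is assumed to contain no double quotes (the argument quoting here is deliberately naive).

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Text;

static class FileRoundTrip
{
    // Reads a UTF-8 file; File.ReadAllText skips a leading BOM if one is present.
    public static string ReadUtf8File(string path) =>
        File.ReadAllText(path, Encoding.UTF8);

    // Hypothetical helper: has PowerShell write the text to a file via Out-File
    // rather than to the redirected stdout, then reads the file back.
    public static string RunToFile(string text)
    {
        string path = Path.GetTempFileName();
        try
        {
            // Single quotes are doubled, which is how PowerShell escapes them
            // inside single-quoted strings.
            string script = "Write-Output '" + text.Replace("'", "''") + "'"
                          + " | Out-File '" + path.Replace("'", "''") + "' -Encoding utf8";

            var startInfo = new ProcessStartInfo
            {
                FileName = "PowerShell.exe",
                Arguments = "-NoProfile -Command \"" + script + "\"",
                UseShellExecute = false,
                CreateNoWindow = true
            };

            using (var process = Process.Start(startInfo))
            {
                process.WaitForExit();
            }

            // Out-File ends the output with a newline; trim it off.
            return ReadUtf8File(path).TrimEnd('\r', '\n');
        }
        finally
        {
            File.Delete(path);
        }
    }

    static void Main()
    {
        Console.WriteLine(RunToFile("H\u00E9llo"));
    }
}
```

The trade-off is an extra file per invocation, so this only makes sense when the output is small and the script can be changed to write to a file.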
Nothing else you can do on the PowerShell front then, but the following post may well help you: