Consider this program:
{$APPTYPE CONSOLE}
begin
Writeln('АБВГДЕЖЅZЗИІКЛМНОПҀРСТȢѸФХѾЦЧШЩЪЫЬѢѤЮѦѪѨѬѠѺѮѰѲѴ');
end.
The output on my console which uses the Consolas font is:
????????Z??????????????????????????????????????
The Windows console is quite capable of supporting Unicode as evidenced by this program:
{$APPTYPE CONSOLE}
uses
Winapi.Windows;
const
Text = 'АБВГДЕЖЅZЗИІКЛМНОПҀРСТȢѸФХѾЦЧШЩЪЫЬѢѤЮѦѪѨѬѠѺѮѰѲѴ';
var
NumWritten: DWORD;
begin
WriteConsole(GetStdHandle(STD_OUTPUT_HANDLE), PChar(Text), Length(Text), NumWritten, nil);
end.
for which the output is:
АБВГДЕЖЅZЗИІКЛМНОПҀРСТȢѸФХѾЦЧШЩЪЫЬѢѤЮѦѪѨѬѠѺѮѰѲѴ
Can Writeln
be persuaded to respect Unicode, or is it inherently crippled?
Just set the console output codepage through the SetConsoleOutputCP()
routine with codepage cp_UTF8
.
program Project1;
{$APPTYPE CONSOLE}
uses
System.SysUtils,Windows;
Const
Text = 'АБВГДЕЖЅZЗИІКЛМНОПҀРСТȢѸФХѾЦЧШЩЪЫЬѢѤЮѦѪѨѬѠѺѮѰѲѴ';
VAR
NumWritten: DWORD;
begin
ReadLn; // Make sure Consolas font is selected
try
WriteConsole(GetStdHandle(STD_OUTPUT_HANDLE), PChar(Text), Length(Text), NumWritten, nil);
SetConsoleOutputCP(CP_UTF8);
WriteLn;
WriteLn('АБВГДЕЖЅZЗИІКЛМНОПҀРСТȢѸФХѾЦЧШЩЪЫЬѢѤЮѦѪѨѬѠѺѮѰѲѴ');
except
on E: Exception do
Writeln(E.ClassName, ': ', E.Message);
end;
ReadLn;
end.
Outputs:
АБВГДЕЖЅZЗИІКЛМНОПҀРСТȢѸФХѾЦЧШЩЪЫЬѢѤЮѦѪѨѬѠѺѮѰѲѴ
АБВГДЕЖЅZЗИІКЛМНОПҀРСТȢѸФХѾЦЧШЩЪЫЬѢѤЮѦѪѨѬѠѺѮѰѲѴ
WriteLn()
translates Unicode UTF16 strings to the selected output codepage (cp_UTF8) internally.
Update:
The above works in Delphi-XE2 and above. In Delphi-XE you need an explicit conversion to UTF-8 to make it work properly.
WriteLn(UTF8String('АБВГДЕЖЅZЗИІКЛМНОПҀРСТȢѸФХѾЦЧШЩЪЫЬѢѤЮѦѪѨѬѠѺѮѰѲѴ'));
Addendum:
If an output to the console is done in another codepage before calling SetConsoleOutputCP(cp_UTF8)
,
the OS will not correctly output text in utf-8
.
This can be fixed by closing/reopening the stdout handler.
Another option is to declare a new text output handler for utf-8
.
var
toutUTF8: TextFile;
...
SetConsoleOutputCP(CP_UTF8);
AssignFile(toutUTF8,'',cp_UTF8); // Works in XE2 and above
Rewrite(toutUTF8);
WriteLn(toutUTF8,'АБВГДЕЖЅZЗИІКЛМНОПҀРСТȢѸФХѾЦЧШЩЪЫЬѢѤЮѦѪѨѬѠѺѮѰѲѴ');
The System
unit declares a variable named AlternateWriteUnicodeStringProc
that allows customisation of how Writeln
performs output. This program:
{$APPTYPE CONSOLE}
uses
Winapi.Windows;
function MyAlternateWriteUnicodeStringProc(var t: TTextRec; s: UnicodeString): Pointer;
var
NumberOfCharsWritten, NumOfBytesWritten: DWORD;
begin
Result := @t;
if t.Handle = GetStdHandle(STD_OUTPUT_HANDLE) then
WriteConsole(t.Handle, Pointer(s), Length(s), NumberOfCharsWritten, nil)
else
WriteFile(t.Handle, Pointer(s)^, Length(s)*SizeOf(WideChar), NumOfBytesWritten, nil);
end;
var
UserFile: Text;
begin
AlternateWriteUnicodeStringProc := MyAlternateWriteUnicodeStringProc;
Writeln('АБВГДЕЖЅZЗИІКЛМНОПҀРСТȢѸФХѾЦЧШЩЪЫЬѢѤЮѦѪѨѬѠѺѮѰѲѴ');
Readln;
end.
produces this output:
АБВГДЕЖЅZЗИІКЛМНОПҀРСТȢѸФХѾЦЧШЩЪЫЬѢѤЮѦѪѨѬѠѺѮѰѲѴ
I'm sceptical of how I've implemented MyAlternateWriteUnicodeStringProc
and how it would interact with classic Pascal I/O. However, it appears to behave as desired for output to the console.
The documentation of AlternateWriteUnicodeStringProc
currently says, wait for it, ...
Embarcadero Technologies does not currently have any additional information. Please help us document this topic by using the Discussion page!
WriteConsoleW
seems to be a quite magical function.
procedure WriteLnToConsoleUsingWriteFile(CP: Cardinal; AEncoding: TEncoding; const S: string);
var
Buffer: TBytes;
NumWritten: Cardinal;
begin
Buffer := AEncoding.GetBytes(S);
// This is a side effect and should be avoided ...
SetConsoleOutputCP(CP);
WriteFile(GetStdHandle(STD_OUTPUT_HANDLE), Buffer[0], Length(Buffer), NumWritten, nil);
WriteLn;
end;
procedure WriteLnToConsoleUsingWriteConsole(const S: string);
var
NumWritten: Cardinal;
begin
WriteConsole(GetStdHandle(STD_OUTPUT_HANDLE), PChar(S), Length(S), NumWritten, nil);
WriteLn;
end;
const
Text = 'АБВГДЕЖЅZЗИІКЛМНОПҀРСТȢѸФХѾЦЧШЩЪЫЬѢѤЮѦѪѨѬѠѺѮѰѲѴ';
begin
ReadLn; // Make sure Consolas font is selected
// Works, but changing the console CP is neccessary
WriteLnToConsoleUsingWriteFile(CP_UTF8, TEncoding.UTF8, Text);
// Doesn't work
WriteLnToConsoleUsingWriteFile(1200, TEncoding.Unicode, Text);
// This does and doesn't need the CP anymore
WriteLnToConsoleUsingWriteConsole(Text);
ReadLn;
end.
So in summary:
WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE), ...)
supports UTF-16.
WriteFile(GetStdHandle(STD_OUTPUT_HANDLE), ...)
doesn't support UTF-16.
My guess would be that in order to support different ANSI encodings the classic Pascal I/O uses the WriteFile
call.
Also keep in mind that when used on a file instead of the console it has to work as well:
unicode text file output differs between XE2 and Delphi 2009?
That means that blindly using WriteConsole
breaks output redirection. If you use WriteConsole
you should fall back to WriteFile
like this:
var
NumWritten: Cardinal;
Bytes: TBytes;
begin
if not WriteConsole(GetStdHandle(STD_OUTPUT_HANDLE), PChar(S), Length(S),
NumWritten, nil) then
begin
Bytes := TEncoding.UTF8.GetBytes(S);
WriteFile(GetStdHandle(STD_OUTPUT_HANDLE), Bytes[0], Length(Bytes),
NumWritten, nil);
end;
WriteLn;
end;
Note that output redirection with any encoding works fine in cmd.exe
. It just writes the output stream to the file unchanged.
PowerShell however expects either ANSI output or the correct preamble (/ BOM) has to be included at the start of the output (or the file will be malencoded!). Also PowerShell will always convert the output into UTF-16 with preamble.
MSDN recommends using GetConsoleMode
to find out if the standard handle is a console handle, also the BOM is mentioned:
WriteConsole fails if it is used with a standard handle that is redirected to a file. If an application processes multilingual output that can be redirected, determine whether the output handle is a console handle (one method is to call the GetConsoleMode function and check whether it succeeds). If the handle is a console handle, call WriteConsole. If the handle is not a console handle, the output is redirected and you should call WriteFile to perform the I/O. Be sure to prefix a Unicode plain text file with a byte order mark. For more information, see Using Byte Order Marks.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With