I am encountering a strange problem when printing Unicode strings to the Windows console*.
Consider this text:
אני רוצה לישון
Intermediary
היא רוצה לישון
אתם, הם
Bye
Hello, world!
test
Assume it's in a file called "file.txt".
When I run "type file.txt"**, it prints out fine. But when it's printed from a Perl program, like this:
use strict;
use warnings;
use Encode;
use 5.014;
use utf8;
use autodie;
use warnings qw< FATAL utf8 >;
use open qw< :std :utf8 >;
use feature qw< unicode_strings >;
use warnings 'all';
binmode STDOUT, ':utf8'; # output should be in UTF-8
my @array = ( 'אני רוצה לישון', 'Intermediary',
    'היא רוצה לישון', 'אתם, הם', 'Bye', 'Hello, world!', 'test' );
foreach my $word (@array) {
    say $word;
}
Each Unicode line (Hebrew in this case) is printed correctly and then followed by repeated, partially broken fragments of its own tail, like this:
E:\My Documents\Technical\Perl>perl "hello unicode.pl"
אני רוצה לישון
לישון
�ן
Intermediary
היא רוצה לישון
לישון
�ן
אתם, הם
�ם
Bye
Hello, world!
test
(I save everything in UTF-8).
This is mighty strange. Any suggestions?
(It's not a "Console2" problem* - the same problem shows up on a "regular" windows console, only there you don't see the Hebrew glyphs).
* Using "Console" (also called "Console2") - it's a nice little utility which enables working with Unicode with the Windows console - see, for example, here: http://www.hanselman.com/blog/Console2ABetterWindowsCommandPrompt.aspx
** Note: at the console, you have to say, of course:
chcp 65001
Did you try the solution from PerlMonks?
It also uses the :unix layer to avoid the console buffer.
This is the code from that link:
use Win32::API;
binmode(STDOUT, ":unix:utf8");
#Must set the console code page to UTF8
$SetConsoleOutputCP= new Win32::API( 'kernel32.dll', 'SetConsoleOutputCP', 'N','N' );
$SetConsoleOutputCP->Call(65001);
$line1="\x{2554}".("\x{2550}"x15)."\x{2557}\n";
$line2="\x{2551}".(" "x15)."\x{2551}\n";
$line3="\x{255A}".("\x{2550}"x15)."\x{255D}";
$unicode_string=$line1.$line2.$line3;
print "THIS IS THE CORRECT EXAMPLE OUTPUT IN PURE PERL: \n";
print $unicode_string;
Guys: after studying that PerlMonks post further, it turns out that this is even neater and nicer:
replace: use Win32::API;
and:
$SetConsoleOutputCP= new Win32::API( 'kernel32.dll', 'SetConsoleOutputCP', 'N','N' );
$SetConsoleOutputCP->Call(65001);
with:
use Win32::Console;
and:
Win32::Console::OutputCP(65001);
Leaving all else intact.
This is even more in the spirit of Perl conciseness and magic.
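For reference, here is a minimal sketch of how the script from the question might look with this change applied. The array is the one from the question. The `$^O` check and the run-time `require` are my own additions (an assumption, not part of the PerlMonks post), there only so the script still loads on systems without Win32::Console.

```perl
use strict;
use warnings;
use utf8;             # this source file is saved as UTF-8
use feature 'say';

# Assumption: only touch the console code page on Windows,
# where Win32::Console is available.
if ( $^O eq 'MSWin32' ) {
    require Win32::Console;
    Win32::Console::OutputCP(65001);    # set the console output code page to UTF-8
}

# :unix bypasses the buffered layer; :utf8 encodes the output as UTF-8.
binmode( STDOUT, ':unix:utf8' );

my @lines = ( 'אני רוצה לישון', 'Intermediary',
    'היא רוצה לישון', 'אתם, הם', 'Bye', 'Hello, world!', 'test' );

say for @lines;
```

I have not tested this on every Windows or Perl version, so treat it as a starting point rather than a guaranteed fix.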