"Turn Off" binmode(STDOUT, ":utf8") Locally

I have the following block at the beginning of my script:

#!/usr/bin/perl5 -w
use strict;
binmode(STDIN, ":utf8");
binmode(STDOUT, ":utf8");
binmode(STDERR, ":utf8");

Some subroutines receive data in a different encoding (from a distant subroutine), and Cyrillic or other non-ASCII characters then display incorrectly. The binmode setting is what causes the problem.

Can I "turn off" the binmode utf8 locally, for the subroutine only?

I can't remove the global binmode setting and I can't change the distant encoding.

DanielLazarov asked Dec 25 '22 00:12


2 Answers

One way to achieve this is to "dup" the STD handle, set the duplicated filehandle to use the :raw layer, and assign it to a local version of the STD handle. For example, the following code

binmode(STDOUT, ':utf8');
print(join(', ', PerlIO::get_layers(STDOUT)), "\n");

{
    open(my $duped, '>&', STDOUT) or die "Cannot dup STDOUT: $!";
    # The ':raw' argument could also be omitted.
    binmode($duped, ':raw');
    local *STDOUT = $duped;
    print(join(', ', PerlIO::get_layers(STDOUT)), "\n");
    close($duped);
}

print(join(', ', PerlIO::get_layers(STDOUT)), "\n");

prints

unix, perlio, utf8
unix, perlio
unix, perlio, utf8

on my system.
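
To apply the same idea to a single subroutine call, the block above could be wrapped in a small helper. This is only a sketch: the names with_raw_stdout and legacy_bytes are made up for illustration, and legacy_bytes merely stands in for the distant subroutine that returns already-encoded bytes.

```perl
use strict;
use warnings;
use IO::Handle;

# Hypothetical helper (not part of any module): run $code while STDOUT is
# temporarily replaced by a duplicate handle without the :utf8 layer.
sub with_raw_stdout {
    my ($code) = @_;

    # Flush text still buffered on the real STDOUT to keep output in order.
    STDOUT->flush();

    open(my $duped, '>&', \*STDOUT) or die "Cannot dup STDOUT: $!";
    binmode($duped, ':raw');

    {
        # The original STDOUT glob is restored when this block exits.
        local *STDOUT = $duped;
        $code->();
    }

    # Closing the duplicate flushes whatever $code printed.
    close($duped) or die "Cannot close duplicated handle: $!";
    return;
}

# Hypothetical stand-in for the distant subroutine: returns KOI8-R bytes.
sub legacy_bytes {
    return "\xF0\xD2\xC9\xD7\xC5\xD4\n";
}

binmode(STDOUT, ':utf8');    # global setting, as in the question

# The bytes pass through untouched instead of being re-encoded as UTF-8.
with_raw_stdout(sub { print legacy_bytes() });

print "STDOUT uses the :utf8 layer again here\n";
```

Because the swap is done with local, the :utf8 layer comes back automatically when the helper returns, even if the callback dies.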

nwellnhof answered Feb 05 '23 16:02


I like @nwellnhof's approach. Dealing only with Unicode and ASCII myself (a luxury few enjoy), my instinct would be to leave the bytes as they are and selectively use Encode's decode()/encode() where needed. If you can determine which of your data sources are problematic, you could insert a decode step when reading from them.

% file koi8r.txt 
koi8r.txt: ISO-8859 text
% cat koi8r.txt 
������ �� ����� � ������� ���. ���
���� ����� ������ ����� �����.
% perl -CO -MEncode="encode,decode" -E 'say decode("koi8-r", <>) ;' koi8r.txt
Американские суда находятся в международных водах. Япония
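
Inside a script, the same decode step might look roughly like the sketch below. It assumes the problematic source is known to produce KOI8-R; fetch_legacy_bytes is a hypothetical stand-in for that source, not a real API.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical stand-in for a data source known to return KOI8-R bytes.
sub fetch_legacy_bytes {
    return "\xF0\xD2\xC9\xD7\xC5\xD4";    # the word "Привет" as KOI8-R bytes
}

# Decode at the boundary: bytes in a known encoding become Perl characters.
my $text = decode('koi8-r', fetch_legacy_bytes());

# With the global :utf8 layer in place, character strings print correctly.
binmode(STDOUT, ':utf8');
print "$text\n";

# Going the other way: encode explicitly when a consumer expects bytes.
my $utf8_bytes = encode('UTF-8', $text);
```

Once the data is decoded, the global binmode setting can stay as it is, since the :utf8 layer then receives characters rather than pre-encoded bytes.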

G. Cito answered Feb 05 '23 16:02