
Perl Encode.pm cannot decode string with wide character

I am running a Perl app that uses /opt/local/lib/perl5/5.12.4/darwin-thread-multi-2level/Encode.pm, and it fails with this error:

Cannot decode string with wide characters at /opt/local/lib/perl5/5.12.4/darwin-thread-multi-2level/Encode.pm line 174.

The decode function in Encode.pm, with line 174 marked, reads:

sub decode($$;$) {
    my ( $name, $octets, $check ) = @_;
    return undef unless defined $octets;
    $octets .= '' if ref $octets;
    $check ||= 0;
    my $enc = find_encoding($name);
    unless ( defined $enc ) {
        require Carp;
        Carp::croak("Unknown encoding '$name'");
    }
    my $string = $enc->decode( $octets, $check );  # line 174
    $_[1] = $octets if $check and !ref $check and !( $check & LEAVE_SRC() );
    return $string;
}

Any workaround?

Meng Lu asked Oct 21 '12 01:10


2 Answers

encode takes a string of Unicode code points and serialises them into a string of bytes.

decode takes a string of bytes and deserialises them into Unicode code points.
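
To make the two directions concrete, here is a minimal round-trip sketch. The variable names are illustrative only, and the choice of U+00E9 and the UTF-8 encoding is an assumption made for the example, not something taken from the question.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# A character string holding one code point, U+00E9.
my $chars = "\x{E9}";

# encode: characters -> bytes. In UTF-8, U+00E9 becomes the two bytes C3 A9.
my $bytes = encode('UTF-8', $chars);

# decode: bytes -> characters. This reverses the step above.
my $round_trip = decode('UTF-8', $bytes);

printf "chars: %d, bytes: %d, round trip equal: %s\n",
    length($chars), length($bytes), ($round_trip eq $chars ? 'yes' : 'no');
# prints "chars: 1, bytes: 2, round trip equal: yes"
```

The input to decode here is a string in which every element is a byte (0..255), which is the only kind of input decode is meant to receive.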

That message means you passed decode a string containing one or more characters above 255. Such characters cannot be bytes, so the string is not valid input for decode.

>perl -MEncode -E"for (254..257) { say; decode('iso-8859-1', chr($_)); }"
254
255
256
Wide character in subroutine entry at .../Encode.pm line 176.

You ask for a workaround, but the bug is yours. Perhaps you are accidentally trying to decode something you already decoded?
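
The double-decode scenario suggested above can be reproduced in a few lines. This is a sketch under assumptions: the input bytes are made up for the example (U+263A encoded as UTF-8), and the exact wording of the error depends on the installed Encode version, so the code only checks that an error occurred.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Bytes as they might arrive from a file or socket: U+263A encoded as UTF-8.
my $octets = encode('UTF-8', "\x{263A}");

# First decode: correct. Three bytes become a one-character string.
my $text = decode('UTF-8', $octets);

# Second decode: wrong. $text now holds a code point above 255,
# so decode dies. eval traps the exception so it can be inspected.
my $again = eval { decode('UTF-8', $text) };
my $error = $@;

print "first decode ok: ", length($text), " character\n";
print "second decode failed: $error" if $error;
```

If the app shows this pattern, the fix is to find the place where the data was already decoded (for example an input layer that decodes on read) and remove the redundant decode call.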

ikegami answered Nov 15 '22 19:11


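I had a similar problem. $enc->decode( $octets, $check ); expects octets.

So call Encode::_utf8_off($octets) before the decode call. This clears the scalar's internal UTF-8 flag, so its buffer is treated as raw octets. That made it work for me.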
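
A small sketch of what that call does to a string that was already decoded. The sample value U+263A and the variable names are assumptions chosen for illustration.

```perl
use strict;
use warnings;
use Encode ();

# An already-decoded string: one character, U+263A.
my $octets = "\x{263A}";
my $chars_before = length($octets);   # 1 character

# Clear the internal UTF-8 flag. The buffer is unchanged, but Perl now
# treats it as the three raw bytes E2 98 BA.
Encode::_utf8_off($octets);
my $bytes_after = length($octets);    # 3 bytes

# decode now accepts the value, because every element is below 256.
my $decoded = Encode::decode('UTF-8', $octets);

printf "before: %d, after flag off: %d, decoded: U+%04X\n",
    $chars_before, $bytes_after, ord($decoded);
# prints "before: 1, after flag off: 3, decoded: U+263A"
```

Note that the Encode documentation lists _utf8_off among its internal functions and advises against using it except in special cases. In this sketch it effectively undoes an earlier decode, so it hides a double decode rather than removing it.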

Aftershock answered Nov 15 '22 18:11