This script prints the same output twice. Are there encodings that would not survive the utf8 encode and decode between the two say calls?
#!/usr/bin/env perl
use warnings;
use 5.16.1;
use Encode qw/encode decode/;
my $my_encoding = 'ISO-8859-7';
binmode STDOUT, ":encoding($my_encoding)";
my $var = "\N{GREEK SMALL LETTER TAU}";
$var .= "\N{GREEK SMALL LETTER OMEGA WITH TONOS}";
$var .= "\N{GREEK CAPITAL LETTER UPSILON WITH DIALYTIKA}";
$var = encode( 'utf8', $var );
$var = decode( $my_encoding, $var );
say $var;
my $test = encode( 'utf8', $var, Encode::FB_CROAK );
$var = decode( 'utf8', $test, Encode::FB_CROAK );
say $var;
With Encode::FB_CROAK, encode croaks if you try to encode something that falls outside the target encoding's character set.
utf8 is a Perl-specific encoding that Perl uses to store 72-bit characters. It is similar to UTF-8 but laxer: it also accepts code points that are not valid Unicode. It supports every character Perl supports, so encoding to it will never croak.
On the other hand, if you were to use UTF-8, it would croak if you tried to encode something that isn't a Unicode character (e.g. chr(0x200000)).
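To see the difference for yourself, here is a minimal sketch that tries a few strings against ISO-8859-7, utf8 and UTF-8 with FB_CROAK. The croaks helper and the sample strings are made up for illustration; they are not part of Encode.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Encode qw/encode/;

# Hypothetical helper: returns 1 if encode() dies for this encoding
# and string under FB_CROAK, 0 if it succeeds.
sub croaks {
    my ( $encoding, $string ) = @_;
    # encode() may modify its input when a CHECK value is passed,
    # so work on a copy.
    my $copy = $string;
    my $ok = eval { encode( $encoding, $copy, Encode::FB_CROAK ); 1 };
    return $ok ? 0 : 1;
}

my %sample = (
    greek    => "\N{U+03C4}",    # GREEK SMALL LETTER TAU, in ISO-8859-7
    cyrillic => "\N{U+0436}",    # CYRILLIC SMALL LETTER ZHE, not in ISO-8859-7
    beyond   => chr(0x200000),   # above U+10FFFF, so not a Unicode character
);

for my $encoding ( 'ISO-8859-7', 'utf8', 'UTF-8' ) {
    for my $name ( sort keys %sample ) {
        printf "%-10s %-8s %s\n", $encoding, $name,
            croaks( $encoding, $sample{$name} ) ? 'croaks' : 'ok';
    }
}
```

Going by the behavior described above, utf8 should report ok for all three strings, UTF-8 should croak only for the code point above U+10FFFF, and ISO-8859-7 should croak for everything except the Greek letter.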
See also: :encoding(UTF-8) vs :encoding(utf8) vs :utf8