The utf8 pragma and utf8 encodings on filehandles have me confused. For example, this apparently straightforward code...
use utf8;
print qq[fü];
To be clear, the hex dump of "fü" in the source file is 66 c3 bc
which, if I'm not mistaken, is proper UTF-8.
The script prints 66 fc
which is not UTF-8 but Unicode or maybe Latin-1. Turn off use utf8
and I get 66 c3 bc.
This is the opposite of what I'd expect.
Now let's add in filehandle pragmas.
use utf8;
binmode *STDOUT, ':encoding(utf8)';
print qq[fü];
Now I get 66 c3 bc. But remove use utf8
and I get 66 c3 83 c2 bc,
which doesn't make any sense to me.
What's the right thing to do to make my code DWIM with UTF8?
PS: My locale is set to "en_US.UTF-8" and I'm running Perl 5.10.1.
While Perl does not implement the Unicode standard or the accompanying technical reports from cover to cover, Perl does support many Unicode features. Also, the use of Unicode may present security issues that aren't obvious, see "Security Implications of Unicode" below.
$octets = encode(ENCODING, $string [, CHECK]) Encodes a string from Perl's internal form into ENCODING and returns a sequence of octets. ENCODING can be either a canonical name or an alias. For encoding names and aliases, see Defining Aliases. For CHECK, see Handling Malformed Data.
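As a rough illustration of the encode call described above, here is a minimal sketch that encodes the two-character string from the question and decodes it again. The hex_dump helper is a hypothetical name introduced only for this example; encode and decode come from the core Encode module.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_dump {
    my ($octets) = @_;
    return join ' ', map { sprintf '%02x', ord($_) } split //, $octets;
}

my $string = "f\x{00FC}";                 # two characters: f and U+00FC
my $octets = encode('UTF-8', $string);    # the same text as three octets
my $back   = decode('UTF-8', $octets);    # two characters again

print hex_dump($octets), "\n";            # 66 c3 bc
print length($string), ' ', length($octets), "\n";   # 2 3
```

The point is that the character string and the octet string are different values: length counts characters in one and bytes in the other.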
UTF-8 (UCS Transformation Format 8) is the World Wide Web's most common character encoding. Each character is represented by one to four bytes. UTF-8 is backward-compatible with ASCII and can represent any standard Unicode character.
UTF-8 is an encoding system for Unicode. It can translate any Unicode character to a matching unique binary string, and can also translate the binary string back to a Unicode character. This is the meaning of “UTF”, or “Unicode Transformation Format.”
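To make the "one to four bytes" point concrete, the sketch below encodes four code points that I picked for illustration (they are not from the original text) and checks that each one decodes back to itself. utf8_length is a hypothetical helper name.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Code points chosen to need 1, 2, 3 and 4 bytes in UTF-8 (illustrative picks).
my @code_points = (0x41, 0xFC, 0x20AC, 0x1F600);

# Hypothetical helper: number of bytes UTF-8 uses for one code point.
sub utf8_length {
    my ($cp) = @_;
    return length(encode('UTF-8', chr($cp)));
}

for my $cp (@code_points) {
    my $octets     = encode('UTF-8', chr($cp));
    my $round_trip = ord(decode('UTF-8', $octets));
    printf "U+%04X -> %d byte(s), decodes back to U+%04X\n",
        $cp, length($octets), $round_trip;
}
```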
use utf8;
tells Perl that your source code is encoded in UTF-8, so qq[fü] is read as two characters rather than three bytes. By adding
binmode *STDOUT, ':encoding(utf8)';
print qq[fü];
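you are asking that the script's output be encoded in UTF-8 as well. Without that layer, Perl writes each character below U+0100 as a single Latin-1 byte, which is why use utf8 alone prints 66 fc. Without use utf8, the literal is the three bytes 66 c3 bc; the encoding layer treats each byte as a character and encodes c3 and bc again, giving 66 c3 83 c2 bc.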
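The four combinations from the question can probably be reproduced without relying on how the source file is saved. This sketch writes to an in-memory filehandle instead of STDOUT and spells the two possible string contents out with escapes. printed_bytes is a hypothetical helper, and the sketch assumes Perl is run without -C or PERL_UNICODE changing the default layers.

```perl
use strict;
use warnings;

# Hypothetical helper: print $text to an in-memory handle, optionally through
# an encoding layer, and return the bytes that were written as hex.
sub printed_bytes {
    my ($text, $layer) = @_;
    my $buffer = '';
    open my $fh, '>', \$buffer or die "Cannot open in-memory handle: $!";
    binmode $fh, $layer if defined $layer;
    print {$fh} $text;
    close $fh;
    return join ' ', map { sprintf '%02x', ord($_) } split //, $buffer;
}

my $chars = "f\x{FC}";      # what qq[fü] holds under "use utf8": 2 characters
my $bytes = "f\xC3\xBC";    # what qq[fü] holds without it: 3 raw bytes

print printed_bytes($chars), "\n";                       # 66 fc
print printed_bytes($bytes), "\n";                       # 66 c3 bc
print printed_bytes($chars, ':encoding(UTF-8)'), "\n";   # 66 c3 bc
print printed_bytes($bytes, ':encoding(UTF-8)'), "\n";   # 66 c3 83 c2 bc
```

Only the third combination is correct end to end: the second one just passes the source bytes through untouched, which looks right on a UTF-8 terminal but leaves length and regexes working on bytes.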
If you had written
print "f\x{00FC}\n";
you would not have needed use utf8; because the source would then contain only ASCII characters.
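For the "DWIM" part of the question, one common setup is to declare both the source encoding and the handle encoding at the top of the script. This is a sketch of one reasonable convention, not the only option, and it assumes the file itself is saved as UTF-8. The open pragma with :std applies the layer to STDIN, STDOUT and STDERR as well as to handles opened later in that scope.

```perl
use strict;
use warnings;
use utf8;                              # string literals in this file are UTF-8
use open qw(:std :encoding(UTF-8));    # standard handles and new handles use UTF-8

my $text = "fü";

print length($text), "\n";   # 2, provided the file is saved as UTF-8
print "$text\n";             # written to STDOUT as 66 c3 bc 0a
```

With both lines in place, strings inside the program are characters and encoding happens only at the I/O boundary.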