Why does length() say this is 4 logical characters (I would expect it to say 1)?
$ perl -lwe 'print length("🐪")'
4
I guess something is wrong with my expectation. :-) What is it?
Unless you tell Perl that the source code of the script is UTF-8, Perl treats the source as a sequence of single bytes, with each byte counted as one character. The camel is encoded in UTF-8 as 4 bytes, so by default the Perl interpreter sees 🐪 as 4 separate characters. If you change your one-liner to
$ perl -Mutf8 -lwe 'print length("🐪")'
length gives the output you expected, 1.
The utf8 pragma tells Perl that the source is encoded as UTF-8 rather than as single bytes. See perldoc utf8 for more info.
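To see the byte/character distinction without depending on how the source file is saved, here is a minimal sketch that writes the camel's four UTF-8 bytes as escapes and then decodes them. The variable names $bytes and $chars are only illustrative; utf8::decode is the built-in described in perldoc utf8.

```perl
use strict;
use warnings;

# The four UTF-8 bytes that encode U+1F42A (DROMEDARY CAMEL), written as
# escapes so the result does not depend on the encoding of this file.
my $bytes = "\xF0\x9F\x90\xAA";
print length($bytes), "\n";        # 4: each byte counts as one character

# utf8::decode() converts the string in place from UTF-8 bytes to
# characters and returns true on success.
my $chars = $bytes;
utf8::decode($chars) or die "not valid UTF-8";
print length($chars), "\n";        # 1: a single code point
printf "U+%04X\n", ord($chars);    # U+1F42A
```

This is the same thing that happens to a string literal in your source: without the utf8 pragma you get the first case, with it you get the second.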