I need advice on IMAP folder name encoding.
I created an IMAP folder with a Russian name using my mail client (Thunderbird).
The folder name is Проверка
The folder name on the filesystem is user.mylogin.&BB8EQAQ+BDIENQRABDoEMA-
I wrote this code to convert between the two (Perl v5.10.1):
use strict;
use warnings;
use utf8;
use Encode::IMAPUTF7;
my $folder=$ARGV[1];
binmode(STDOUT,':utf8');
if ($ARGV[0] eq 'to')
{ print Encode::IMAPUTF7::encode('IMAP-UTF-7', $folder) }
elsif ($ARGV[0] eq 'from')
{ print Encode::IMAPUTF7::decode('IMAP-UTF-7', $folder) }
print "\n";
First I convert the encoded folder name to Russian:
[w@pandora6 tmp]$ ./imapfolder.pl from '&BB8EQAQ+BDIENQRABDoEMA-'
Проверка
That works fine.
Now the reverse conversion:
[w@pandora6 tmp]$ ./imapfolder.pl to Проверка
&ANAAnwDRAIAA0AC+ANAAsgDQALUA0QCAANAAugDQALA-
Hmm, I expected &BB8EQAQ+BDIENQRABDoEMA-
OK, decoding that result back:
[w@pandora6 tmp]$ ./imapfolder.pl from '&ANAAnwDRAIAA0AC+ANAAsgDQALUA0QCAANAAugDQALA-'
ÐÑовеÑка
WTF? I expected Проверка
What went wrong?
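You have been caught by one of the many gotchas of Unicode in Perl. use utf8 only tells Perl that the source code itself is written in UTF-8. That means string literals, variable names and function names in the file are read as UTF-8. Everything that comes from outside the source file is not affected. Specifically, the strings in @ARGV are not decoded from UTF-8; they are still plain bytes. Проверка arrives as 16 bytes rather than 8 characters, and the encoder treats each byte as a separate character, which is why you get &ANAAnwDR... instead of &BB8EQAQ+...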
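To make the bytes-versus-characters difference visible, here is a minimal sketch using only the core Encode module. The helper name codepoints is made up for this illustration, and the input is written as explicit bytes to mimic what the shell hands over in @ARGV.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: list the code point of every character in a string.
sub codepoints {
    my ($string) = @_;
    return join ' ', map { sprintf 'U+%04X', ord } split //, $string;
}

# The UTF-8 bytes for "Пр", as they would arrive in @ARGV.
my $bytes = "\xD0\x9F\xD1\x80";

# Decoding turns the four bytes into two characters.
my $chars = decode('UTF-8', $bytes);

print codepoints($bytes), "\n";   # U+00D0 U+009F U+00D1 U+0080
print codepoints($chars), "\n";   # U+041F U+0440
```

The first line of output matches the start of the unexpected result: &ANAAnwDRAIAA is the base64 form of 00D0 009F 00D1 0080.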
Fortunately there is a simple fix. Use utf8::all (a CPAN module, not part of core Perl). It turns on all of the UTF-8 features you'd expect use utf8 to turn on:
Makes @ARGV encoded in UTF-8 (when utf8::all is used from the main package).
Filehandles are opened with UTF-8 encoding turned on by default (including STDIN, STDOUT, STDERR). If you don't want UTF-8 for a particular filehandle, you'll have to set binmode $filehandle.
charnames are imported so \N{...} sequences can be used to compile Unicode characters based on names.
readdir now returns UTF-8 characters instead of bytes.
glob and the <> operator now return UTF-8 characters instead of bytes.
Your code reduces to the following (the binmode call is no longer needed):
use strict;
use warnings;
use utf8::all;
use Encode::IMAPUTF7;
my $folder=$ARGV[1];
if ($ARGV[0] eq 'to') {
    print Encode::IMAPUTF7::encode('IMAP-UTF-7', $folder)
}
elsif ($ARGV[0] eq 'from') {
    print Encode::IMAPUTF7::decode('IMAP-UTF-7', $folder)
}
print "\n";
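If installing utf8::all is not an option, the same effect for this script can probably be had by decoding @ARGV by hand with the core Encode module. This is only a sketch, assuming the terminal passes arguments as UTF-8; decode_args is a hypothetical helper name, and Encode::IMAPUTF7 is loaded lazily here so the helper can be tried without it.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn raw command-line bytes into character strings.
# Assumes the shell passes arguments encoded as UTF-8.
sub decode_args {
    return map { decode('UTF-8', $_) } @_;
}

my ($direction, $folder) = decode_args(@ARGV);

# Encode characters as UTF-8 on the way out.
binmode(STDOUT, ':encoding(UTF-8)');

if (defined $direction && defined $folder) {
    require Encode::IMAPUTF7;   # same module and calls as in the question
    if ($direction eq 'to') {
        print Encode::IMAPUTF7::encode('IMAP-UTF-7', $folder), "\n";
    }
    elsif ($direction eq 'from') {
        print Encode::IMAPUTF7::decode('IMAP-UTF-7', $folder), "\n";
    }
}
```

With the arguments decoded first, the encoder sees 8 characters instead of 16 bytes, so ./imapfolder.pl to Проверка should give back the name the mail server stored.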