I have a problem decoding the filename of an e-mail attachment. I'm currently using JavaMail 1.4.2. The file is named "Żółw.rtf" (that's Polish for Turtle.rtf). The mail was sent using Mail.app (which seems to be significant). The important headers are:
--Apple-Mail-19-721116558
Content-Disposition: attachment;
    filename*=utf-8''Z%CC%87o%CC%81%C5%82w.rtf
Content-Type: text/rtf;
    x-unix-mode=0644;
    name="=?utf-8?Q?Z=CC=87o=CC=81=C5=82w=2Ertf?="
Content-Transfer-Encoding: 7bit
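The corresponding javax.mail.Part.getFileName() returns "=?utf-8?Q?Z=CC=87o=CC=81=C5=82w=2Ertf?=", which, after applying MimeUtility.decodeText, becomes "Żółw.rtf" with "Ż" and "ó" each stored as a base letter followed by a combining mark. It may look right on screen, but it does not compare equal to the original name :).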
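For reference, the filename* parameter in the Content-Disposition header carries the same bytes in RFC 2231 form (charset'language'percent-encoded text). The following is only a minimal sketch of decoding such a value by hand, using the standard library alone; the class and method names are made up for illustration and this is not how JavaMail itself does it.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

// Hypothetical helper, not part of JavaMail.
class Rfc2231Value {

    /** Decodes a value of the form charset'language'percent-encoded-text. */
    static String decode(String extValue) {
        int first = extValue.indexOf('\'');
        int second = extValue.indexOf('\'', first + 1);
        if (first < 0 || second < 0) {
            throw new IllegalArgumentException("not an extended value: " + extValue);
        }
        Charset charset = Charset.forName(extValue.substring(0, first));
        String encoded = extValue.substring(second + 1);

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < encoded.length(); i++) {
            char c = encoded.charAt(i);
            if (c == '%' && i + 2 < encoded.length()) {
                // "%XX" stands for one raw byte of the encoded text
                bytes.write(Integer.parseInt(encoded.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                // everything else is a literal ASCII character
                bytes.write(c);
            }
        }
        return new String(bytes.toByteArray(), charset);
    }

    public static void main(String[] args) {
        String name = decode("utf-8''Z%CC%87o%CC%81%C5%82w.rtf");
        // Same decomposed form as the name= parameter: Z + U+0307, o + U+0301, U+0142
        System.out.println(name.equals("Z\u0307o\u0301\u0142w.rtf"));
    }
}
```

Either header therefore yields the same decomposed string, so switching from name to filename* would not by itself change the result.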
For comparison, MimeUtility.encodeText returns:
=?UTF-8?Q?=C5=BB=C3=B3=C5=82w.rtf?=
in contrast to:
=?utf-8?Q?Z=CC=87o=CC=81=C5=82w=2Ertf?=
coming from the e-mail.
According to my research, the letter "Ż" can be encoded in two ways: either as a single precomposed character (U+017B) or as "Z" followed by a combining dot above (U+0307). The MimeUtility.encodeText output above contains the former, the Mail.app header the latter.
However, I want to be able to decode both. Is there a way to decode the filename sent by Mail.app using JavaMail? Or is there some other library that can?
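To make the two forms concrete, here is a small sketch that prints the UTF-8 bytes of each and checks whether they compare equal. The class name and helper methods are invented for this example; it assumes Java 7 or later for StandardCharsets.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

// Illustrative only: compares the two representations of the same letter.
class ComposedVsDecomposed {
    // One precomposed code point, as in the MimeUtility.encodeText output
    static final String PRECOMPOSED = "\u017B";
    // "Z" followed by COMBINING DOT ABOVE, as in the Mail.app header
    static final String DECOMPOSED = "Z\u0307";

    /** Returns the UTF-8 bytes of s as upper-case hex. */
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    /** True if a and b are the same text once both are brought to NFC. */
    static boolean equalAfterNfc(String a, String b) {
        return Normalizer.normalize(a, Normalizer.Form.NFC)
                .equals(Normalizer.normalize(b, Normalizer.Form.NFC));
    }

    public static void main(String[] args) {
        System.out.println(utf8Hex(PRECOMPOSED));                   // C5BB
        System.out.println(utf8Hex(DECOMPOSED));                    // 5ACC87
        System.out.println(PRECOMPOSED.equals(DECOMPOSED));         // false
        System.out.println(equalAfterNfc(PRECOMPOSED, DECOMPOSED)); // true
    }
}
```

The byte sequences match the =C5=BB and Z=CC=87 fragments in the two encoded words shown earlier.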
Thanks! Adam
Turns out you have to normalize the string:
String decoded = MimeUtility.decodeText(part.getFileName());
return Normalizer.normalize(decoded, Normalizer.Form.NFC);
Weird, but it works! :) In more detail: Mail.app encodes "Ż" as two characters, "Z" + "dot-above", so they have to be recombined into a single character using java.text.Normalizer with the NFC form.
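A slightly more defensive variant, as a sketch only: it takes the string that MimeUtility.decodeText already produced, so it runs without JavaMail on the classpath. The class and method names are hypothetical. The null check is there because Part.getFileName() can return null when no filename is available.

```java
import java.text.Normalizer;

// Hypothetical helper; pass it the result of MimeUtility.decodeText(...).
class AttachmentNames {

    /** Recombines base letters and combining marks (NFC); null stays null. */
    static String normalizeFileName(String decoded) {
        if (decoded == null) {
            return null;
        }
        return Normalizer.normalize(decoded, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        // The decomposed name as Mail.app sent it
        String fromMailApp = "Z\u0307o\u0301\u0142w.rtf";
        // The precomposed name as typed on most other systems
        String expected = "\u017B\u00F3\u0142w.rtf";

        System.out.println(fromMailApp.equals(expected));                    // false
        System.out.println(normalizeFileName(fromMailApp).equals(expected)); // true
    }
}
```

Names that are already precomposed pass through unchanged, so the same call can be applied to every attachment regardless of the sending client.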
Adam