
Decoding a Mail.app e-mail attachment filename in Java

I have a problem decoding the filename of an e-mail attachment. I'm currently using JavaMail 1.4.2. The file is named "Żółw.rtf" (Polish for "Turtle.rtf"). The mail was sent using Mail.app (which seems to be quite significant). The important headers are:

--Apple-Mail-19-721116558
Content-Disposition: attachment;
   filename*=utf-8''Z%CC%87o%CC%81%C5%82w.rtf
Content-Type: text/rtf;
   x-unix-mode=0644;
   name="=?utf-8?Q?Z=CC=87o=CC=81=C5=82w=2Ertf?="
Content-Transfer-Encoding: 7bit

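The corresponding javax.mail.Part.getFileName() returns "=?utf-8?Q?Z=CC=87o=CC=81=C5=82w=2Ertf?=", which, after applying MimeUtility.decodeText, becomes "Żółw.rtf" with "Ż" and "ó" each stored as a base letter followed by a combining mark (Z + U+0307, o + U+0301). It renders the same, but it is clearly not the original string :).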

For comparison, MimeUtility.encodeText returns:

=?UTF-8?Q?=C5=BB=C3=B3=C5=82w.rtf?=

in contrast to:

=?utf-8?Q?Z=CC=87o=CC=81=C5=82w=2Ertf?=

coming from the e-mail.

According to my research, the letter "Ż" can be encoded in two ways: either as a single precomposed character (U+017B) or as "Z" followed by a combining dot above (U+0307). MimeUtility.encodeText uses the former, Mail.app the latter.
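To make the difference between the two encodings concrete, here is a minimal sketch using only java.text.Normalizer from the JDK. The class name NormalizationDemo and the helper toNfc are made up for illustration; the two string constants are the code points behind the byte sequences shown above.

```java
import java.text.Normalizer;

class NormalizationDemo {
    // Precomposed form, as produced by MimeUtility.encodeText:
    // U+017B (Ż), U+00F3 (ó), U+0142 (ł).
    static final String COMPOSED = "\u017B\u00F3\u0142w.rtf";

    // Decomposed form, as sent by Mail.app:
    // "Z" + U+0307 (combining dot above), "o" + U+0301 (combining acute), U+0142 (ł).
    static final String DECOMPOSED = "Z\u0307o\u0301\u0142w.rtf";

    // Hypothetical helper: recompose base letters and combining marks (NFC).
    static String toNfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        // The strings render alike but differ code point by code point.
        System.out.println(COMPOSED.equals(DECOMPOSED));                      // false
        System.out.println(COMPOSED.length() + " vs " + DECOMPOSED.length()); // 8 vs 10
        // After NFC normalization they compare equal.
        System.out.println(COMPOSED.equals(toNfc(DECOMPOSED)));               // true
    }
}
```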

However, I want to be able to decode both. Is there a way to decode the filename sent from Mail.app using JavaMail? Or is there some other library that can?

Thanks! Adam

adamw asked Apr 20 '11 11:04



1 Answer

Turns out you have to normalize the string:

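// Requires java.text.Normalizer (Java 6+) and javax.mail.internet.MimeUtility.
String decoded = MimeUtility.decodeText(part.getFileName());
// NFC recomposes "Z" + U+0307 into the single character "Ż".
return Normalizer.normalize(decoded, Normalizer.Form.NFC);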

Weird, but it works! :) In more detail: Mail.app encodes "Ż" as two characters, "Z" followed by a combining dot above, so they have to be recombined into a single character using the Normalizer.
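The same normalization step should apply to the filename*= parameter from the Content-Disposition header in the question, which carries the name in RFC 2231 form (charset'language'percent-encoded bytes). The following is only a rough sketch that avoids JavaMail entirely: the class ExtendedFilenameDecoder and its decode method are hypothetical names, and it assumes a well-formed value with a non-empty charset and plain ASCII outside the %XX escapes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.text.Normalizer;

class ExtendedFilenameDecoder {
    // Hypothetical helper: decodes a value such as utf-8''Z%CC%87o%CC%81%C5%82w.rtf
    // and normalizes the result to NFC.
    static String decode(String extValue) {
        int first = extValue.indexOf('\'');
        int second = extValue.indexOf('\'', first + 1);
        if (first < 0 || second < 0) {
            throw new IllegalArgumentException("expected charset'language'value");
        }
        // Assumption: the charset part is present and known to the JVM.
        Charset charset = Charset.forName(extValue.substring(0, first));
        String encoded = extValue.substring(second + 1);

        // Collect raw bytes: %XX escapes become one byte, other characters
        // are assumed to be ASCII and are copied as-is.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < encoded.length(); i++) {
            char c = encoded.charAt(i);
            if (c == '%' && i + 2 < encoded.length()) {
                bytes.write(Integer.parseInt(encoded.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                bytes.write(c);
            }
        }

        String decoded = new String(bytes.toByteArray(), charset);
        // Recombine "Z" + U+0307 and "o" + U+0301 into single characters.
        return Normalizer.normalize(decoded, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        String name = decode("utf-8''Z%CC%87o%CC%81%C5%82w.rtf");
        System.out.println(name.equals("\u017B\u00F3\u0142w.rtf")); // true
    }
}
```

Normalizing at the end means callers get the same string whether the sender used precomposed or decomposed characters.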

Adam

adamw answered Sep 19 '22 10:09
