Subject encoding on SmtpClient/MailMessage

I am trying to send emails that contain non-ASCII characters using the SmtpClient and MailMessage classes.

I am using an external mailing service (MailChimp) and some of my emails have been rejected by their SMTP server. I have contacted them and this is what they replied:

It appears the subject line is being Base64 encoded and then Quoted-Printable encoded, which generally should be fine, but one of the characters is being broken across two lines. So when your subject lines are a bit longer, in order to be processed correctly, it's broken onto two lines. When using UTF-8 quoted printable in a subject line, character strings aren't supposed to be broken between lines. Instead a line should be shortened so that the full character string remains together. In this case, that's not happening, so the string of characters that represents a single character is being broken across multiple lines, and therefore isn't validly UTF-8 quoted-printable encoded.
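The rule described in the reply (never cut the bytes of one character across two encoded words) matches RFC 2047, which requires each encoded word to represent a whole number of characters. A minimal Python sketch of such a splitter follows. It is not the .NET implementation; the function name `encode_subject_words` and the 36-byte default are hypothetical choices made for illustration.

```python
import base64


def encode_subject_words(text: str, max_bytes: int = 36) -> list[str]:
    """Split text into '=?utf-8?B?...?=' encoded words without cutting a character.

    max_bytes is a hypothetical per-word budget for the raw UTF-8 bytes.
    """
    words = []
    chunk = b""
    for ch in text:
        encoded_char = ch.encode("utf-8")  # 1 to 4 bytes for one character
        # Flush the current chunk if this character would not fit entirely.
        if chunk and len(chunk) + len(encoded_char) > max_bytes:
            words.append("=?utf-8?B?" + base64.b64encode(chunk).decode("ascii") + "?=")
            chunk = b""
        chunk += encoded_char
    if chunk:
        words.append("=?utf-8?B?" + base64.b64encode(chunk).decode("ascii") + "?=")
    return words


subject = "XXXXXXX - 5 personnes vous ont nommé guide"
words = encode_subject_words(subject)
for word in words:
    print(word)
```

Because the split is decided per character rather than per byte, every word can be Base64-decoded and UTF-8-decoded on its own. With a 36-byte budget, each word stays well under the 75-character limit that RFC 2047 sets for an encoded word.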

The problematic subject is the following:

Subject: XXXXXXX - 5 personnes vous ont nommé guide

Which is, in UTF-8/Base64:

Subject: WFhYWFhYWCAtIDUgcGVyc29ubmVzIHZvdXMgb250IG5vbW3DqSBndWlkZQ==

Because that header would exceed a certain maximum length (I am unsure whether it is the Quoted-Printable limit of 76 characters per line or the SMTP header limit), it is split after encoding and becomes:

Subject: =?utf-8?B?WFhYWFhYWCAtIDUgcGVyc29ubmVzIHZvdXMgb250IG5vbW3D?=
 =?utf-8?B?qSBndWlkZQ==?=

Apparently this causes an issue when decoding (because the first line cannot be decoded to a valid string). I am not sure I fully understand the problem, and I have the following questions:

  • Why is the ?utf-8?B? part repeated? Shouldn't the QP encoding happen before splitting the line and thus its header shouldn't be repeated?
  • After QP-decoding, shouldn't we end up with a valid 1-line Base64 string?
  • There is a space at the start of the second line which is outside of the QP encoding, could this be the problem?
  • Is the encoder broken, or is it the decoder?
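The decoding concern can be reproduced with a short Python sketch that treats each `=?charset?encoding?payload?=` word from the header above on its own (in this syntax, `B` stands for Base64). The helper names `payload_bytes` and `decodes_alone` are hypothetical; this only illustrates the byte-level issue and says nothing about how MailChimp's server is implemented.

```python
import base64

subject = "XXXXXXX - 5 personnes vous ont nommé guide"

# The two encoded words exactly as they appear in the split header.
words = [
    "=?utf-8?B?WFhYWFhYWCAtIDUgcGVyc29ubmVzIHZvdXMgb250IG5vbW3D?=",
    "=?utf-8?B?qSBndWlkZQ==?=",
]


def payload_bytes(word: str) -> bytes:
    """Return the raw bytes carried by one Base64 ('B') encoded word."""
    _, _charset, encoding, payload, _ = word.split("?")
    if encoding.upper() != "B":
        raise ValueError("only Base64 encoded words are handled in this sketch")
    return base64.b64decode(payload)


def decodes_alone(word: str) -> bool:
    """Check whether one encoded word is valid UTF-8 when decoded by itself."""
    try:
        payload_bytes(word).decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


chunks = [payload_bytes(w) for w in words]
independent = [decodes_alone(w) for w in words]

# Joining the raw bytes first and decoding afterwards recovers the subject.
joined = b"".join(chunks).decode("utf-8")

print(chunks[0][-1:], chunks[1][:1])  # the two bytes of "é", one in each word
print(independent)
print(joined)
```

The first word ends with the byte 0xC3 and the second starts with 0xA9, which are the two UTF-8 bytes of "é". A decoder that handles each word separately therefore fails on both, while one that concatenates the bytes before decoding succeeds, which would explain why some servers accept the message.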

Also note that some other SMTP servers will accept this message, though that does not mean it is valid.

As a workaround, I have tried disabling the Base64 encoding, which apparently is unnecessary. However, the BodyTransferEncoding property of the MailMessage class controls this encoding only for the body of the message. No property seems to control the "transfer" encoding of the subject.

Xavier Poinas asked Apr 16 '13 04:04


2 Answers

This was confirmed as a bug in the MSDN forums:
http://social.msdn.microsoft.com/Forums/vstudio/en-US/4d1c1752-70ba-420a-9510-8fb4aa6da046/subject-encoding-on-smtpclientmailmessage

And a bug was filed on Microsoft Connect: https://connect.microsoft.com/VisualStudio/feedback/details/785710/mailmessage-subject-incorrectly-encoded-in-utf-8-base64

One workaround is to set the SubjectEncoding of the MailMessage to another encoding, such as ISO-8859-1. In this case, the subject will be encoded in Quoted-Printable (not Base64), which avoids the problem.
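To illustrate what a Quoted-Printable style ISO-8859-1 subject looks like, here is a small Python sketch using the standard `email.header.Header` class. This is Python's encoder, not .NET's, so the exact output of MailMessage may differ; it only shows that every character is a single byte in this charset, so there is no multi-byte sequence to cut in half.

```python
from email.header import Header

subject = "XXXXXXX - 5 personnes vous ont nommé guide"

# In ISO-8859-1 every character is exactly one byte ("é" is 0xE9).
latin1_bytes = subject.encode("iso-8859-1")

# Python's email package picks the Q (Quoted-Printable style) encoding
# for ISO-8859-1 headers.
encoded = Header(subject, "iso-8859-1").encode()

print(len(subject), len(latin1_bytes))
print(encoded)
```

The trade-off is that ISO-8859-1 cannot represent characters outside Latin-1; in Python, `subject.encode("iso-8859-1")` raises `UnicodeEncodeError` for such text.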

Xavier Poinas answered Sep 25 '22 17:09


A better solution is to use Encoding.Unicode instead of Encoding.UTF8 for the SubjectEncoding.

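This appears to help because the Microsoft implementation treats UTF-16 as if every character occupied exactly two bytes, ignoring the fact that UTF-16 needs more than two bytes for some characters (as seen on Why does C# use UTF-16 for strings?). With that stable character size, the split apparently no longer falls inside a character.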

I've seen this used on https://gist.github.com/dbykadorov/9047455.
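The "stable character size" point can be checked with a short Python sketch. It assumes, as is the case in .NET, that Encoding.Unicode means UTF-16 little-endian; the helper name `utf16_sizes` is hypothetical.

```python
def utf16_sizes(text: str) -> list[int]:
    """Return the number of UTF-16 (little-endian) bytes used by each character."""
    return [len(ch.encode("utf-16-le")) for ch in text]


bmp_sizes = utf16_sizes("nommé")          # characters in the Basic Multilingual Plane
astral_sizes = utf16_sizes("\U0001F600")  # a character outside it (surrogate pair)

print(bmp_sizes)
print(astral_sizes)
```

Characters in the Basic Multilingual Plane always take two bytes, so a split on an even byte offset cannot cut them. Characters outside it take four bytes, so this workaround may not protect subjects that contain them.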

Pablo Montilla answered Sep 24 '22 17:09