I am trying to send emails that contain non-ASCII characters using the SmtpClient
and MailMessage
classes.
I am using an external mailing service (MailChimp) and some of my emails have been rejected by their SMTP server. I have contacted them and this is what they replied:
It appears the subject line is being Base64 encoded and then Quoted-Printable encoded, which generally should be fine, but one of the characters is being broken across two lines. So when your subject lines are a bit longer, in order to be processed correctly, it's broken on to two lines. When using UTF-8 quoted printable in a subject line, character strings aren't supposed to be broken between lines. Instead a line should be shorted so that the full character string remains together. In this case, that's not happening, so the string of characters that represents a single character is being broken across multiple lines, and therefore isn't validly UTF-8 quoted-printable encoded.
The problematic subject is the following:
Subject: XXXXXXX - 5 personnes vous ont nommé guide
Which is, in UTF-8/Base64:
Subject: WFhYWFhYWCAtIDUgcGVyc29ubmVzIHZvdXMgb250IG5vbW3DqSBndWlkZQ==
Because that header would exceed a certain maximum length (I am unsure whether it is the Quoted-Printable encoding and its limit of 76 characters per line, or the SMTP header limit), after encoding and split, the header will become:
Subject: =?utf-8?B?WFhYWFhYWCAtIDUgcGVyc29ubmVzIHZvdXMgb250IG5vbW3D?=
=?utf-8?B?qSBndWlkZQ==?=
Apparently this causes an issue when decoding (because the first line cannot be decoded to a valid string). I am not sure I fully understand the problem, and I have the following questions:
Also note that some other SMTP servers will accept this message, though that does not mean it is valid.
As a workaround, I have tried disabling the Base64 encoding, which apparently is unnecessary, however the MailMessage class has a BodyTransferEncoding property that controls this encoding, but only for the body part of the message. No property seems to control the "transfer" encoding of the subject.
This was confirmed as a bug in the MSDN forums:
http://social.msdn.microsoft.com/Forums/vstudio/en-US/4d1c1752-70ba-420a-9510-8fb4aa6da046/subject-encoding-on-smtpclientmailmessage
And a bug was filed on Microsoft Connect: https://connect.microsoft.com/VisualStudio/feedback/details/785710/mailmessage-subject-incorrectly-encoded-in-utf-8-base64
One work-around is to set the SubjectEncoding of the MailMessage to an other encoding, such as ISO-8859-1. In this case, the subject will be encoded in Quoted Printable (not Base64) which avoids the problem.
A better solution is to use Encoding.Unicode
instead of Encoding.UTF8
for the SubjectEncoding
.
It appears that, as the Microsoft implementation simply ignores the reality of UTF-16 being able to encode characters in more than two bytes (as seen on Why does C# use UTF-16 for strings?), the stable character size helps.
I've seen this used on https://gist.github.com/dbykadorov/9047455.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With