
Encoding of headers in MIMEText

I'm using MIMEText to create an email from scratch in Python 3.2, and I have trouble creating messages with non-ascii characters in the subject.

For example

from email.mime.text import MIMEText
body = "Some text"
subject = "» My Subject"                   # first char is non-ascii
msg = MIMEText(body,'plain','utf-8')
msg['Subject'] = subject                   # <<< Problem probably here
text = msg.as_string()

The last line gives me the error

UnicodeEncodeError: 'ascii' codec can't encode character '\xbb' in position 0: ordinal not in range(128)

How do I tell MIMEText that the subject is not ASCII? subject.encode('utf-8') doesn't help at all, and anyway I've seen people using unicode strings with no problems in other answers (see for example Python - How to send utf-8 e-mail?).

Edit: I'd like to add that the same code doesn't give any error in Python 2.7 (though that doesn't mean that the result is correct).

Marco Righele asked Apr 18 '13 13:04



2 Answers

I found the solution. Email headers containing non-ASCII characters need to be encoded as per RFC 2047. In Python this means using email.header.Header instead of a regular string for the header content (see http://docs.python.org/2/library/email.header.html). The right way to write the above example is then

from email.mime.text import MIMEText
from email.header import Header
body = "Some text"
subject = "» My Subject"                   
msg = MIMEText(body,'plain','utf-8')
msg['Subject'] = Header(subject,'utf-8')
text = msg.as_string()

The subject string will be encoded in the email as

=?utf-8?q?=C2=BB_My_Subject?=
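One way to check that this encoded form still carries the original subject is to parse the message back and decode the header. This is only a sketch, assuming a Python 3 interpreter with the standard email package; the names raw_subject and decoded_subject are mine, not part of the original example.

```python
from email import message_from_string
from email.header import Header, decode_header, make_header
from email.mime.text import MIMEText

subject = "» My Subject"  # first char is non-ASCII

msg = MIMEText("Some text", "plain", "utf-8")
msg["Subject"] = Header(subject, "utf-8")
text = msg.as_string()

# Parse the serialized message and read the header as it appears on the wire,
# i.e. still in RFC 2047 encoded-word form.
raw_subject = message_from_string(text)["Subject"]

# decode_header splits the value into (bytes, charset) pairs;
# make_header rebuilds a Header from them, and str() gives the readable text.
decoded_subject = str(make_header(decode_header(raw_subject)))

print(raw_subject)
print(decoded_subject)
```

The raw value should contain only ASCII characters, while the decoded value should match the subject string that was passed in.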

The fact that the previous code worked for me in Python 2.x is probably down to the mail client being able to interpret the wrongly encoded header.

Marco Righele answered Nov 12 '22 05:11



I've found that replacing

msg['Subject'] = subject

with

msg.add_header('Subject', subject)

works for getting UTF-8 to display. If you want another character set, you can do that, too. Try help(msg.add_header) to see the docs on that (replace the value, that is, subject, with a tuple containing three elements: (charset, language, value)).

Anyway, this seems a little simpler than the other method, so I thought I'd mention it. I decided to try this since add_header seems to work more often for the 'reply-to' header than just doing msg["reply-to"] = your_reply_to_email. So I thought maybe it would be better for subjects too, and the docs said it supported UTF-8 by default (which I tested, and it worked).
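For anyone who wants to try this, here is a small check of the add_header approach. It is a sketch under the assumption of a recent Python 3 (the version this was tested on isn't stated above, and the result on 3.2 from the question may differ); the names msg2, subject_line and roundtrip are made up for the example.

```python
from email import message_from_string
from email.header import decode_header, make_header
from email.mime.text import MIMEText

subject = "» My Subject"  # first char is non-ASCII

msg2 = MIMEText("Some text", "plain", "utf-8")
msg2.add_header("Subject", subject)  # plain str, no Header object
text2 = msg2.as_string()

# Pull the serialized Subject line out of the message text to see
# what would actually be sent.
subject_line = next(
    line for line in text2.splitlines() if line.startswith("Subject:")
)

# Parse the message back and decode the header to compare with the input.
parsed = message_from_string(text2)
roundtrip = str(make_header(decode_header(parsed["Subject"])))

print(subject_line)
print(roundtrip)
```

If the serialized Subject line is pure ASCII and the decoded value equals the original string, the header was encoded on the way out rather than sent as raw non-ASCII text.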

Brōtsyorfuzthrāx answered Nov 12 '22 04:11
