
Python IMAP search using a subject encoded with iso-8859-1

From a different account, I sent myself an email with the subject "Test de réception en local". Now, using IMAP, I want to find that email by searching on its subject.

When doing a search for ALL and finding the email among the output, I see:
Subject: =?ISO-8859-1?Q?Test_de_r=E9ception_en_local?=

So now, searching with IMAP, I try:

M = imaplib.IMAP4_SSL('imap.gmail.com', 993)
M.login('[email protected]', 'password')
M.select('[Gmail]/All Mail')

subject = Header(email_model.subject, 'iso-8859-1').encode() #email_model.subject is in unicode, utf-8 encoded
typ, data = M.search('iso-8859-1', '(SUBJECT "%s")' % subject)
for num in data[0].split():
    typ, data = M.fetch(num, '(RFC822)')
    print 'Message %s\n%s\n' % (num, data[0][1])
M.close()
M.logout()

print 'Fin'

If you print out subject, the result looks exactly the same as what I get from the IMAP server in my earlier, broader search. Yet this more specific search doesn't find a match.

For the search, I have tried everything I can think of:

typ, data = M.search('iso-8859-1', '(HEADER subject "%s")' % subject)
typ, data = M.search('iso-8859-1', 'ALL (SUBJECT "%s")' % subject)

And others that I can't recall at the moment, all without any luck.

I can search (and match) for emails that have subjects that only use ASCII, but it doesn't work with any subject that has an encoding applied. So...

With IMAP, what is the proper way to search for an email using a subject that has an encoding applied?

Thanks

rfadams asked Jan 21 '23 00:01


1 Answer

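When talking to IMAP servers, check the IMAP RFC (RFC 3501).

You must remove the extra quotes, and you must not MIME-encode the search string (that is, do not send an encoded-word like the =?ISO-8859-1?Q?...?= form you see in the raw header). Also, the charset argument specifies the charset of the search query, not the charset of the message header. This should work (works for me):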

M.search("utf-8", "(SUBJECT %s)" % u"réception".encode("utf-8"))
# this also works:
M.search("iso8859-1", "(SUBJECT %s)" % u"réception".encode("iso8859-1"))
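
To make the difference concrete, here is a minimal sketch, written for Python 3 (an assumption; the snippets above are Python 2), that contrasts the MIME-encoded subject from the question with the plain bytes a search should carry. It only uses the standard email.header module and does not talk to a server. The variable names are illustrative.

```python
from email.header import Header, decode_header

subject = "Test de réception en local"

# What the question sent: an RFC 2047 encoded-word, i.e. the wire form
# of the header, not the text the search should look for.
encoded_word = Header(subject, "iso-8859-1").encode()

# Decoding the encoded-word gives back the original text. Per RFC 3501,
# header strings are decoded before being compared in a non-ASCII charset.
raw, charset = decode_header(encoded_word)[0]
decoded = raw.decode(charset)

# What to send instead: the plain text, encoded in the same charset
# that is named in the SEARCH command.
search_bytes = subject.encode("utf-8")
```

Here decoded equals the original subject, which is why searching for the encoded-word itself finds nothing: it is not a substring of the decoded subject text.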

Edit:

Apparently some servers (at least Gmail as of August 2013) support UTF-8 strings only when they are sent as literals. Python's imaplib has very limited support for literal arguments; the best one can do is something like:

term = u"réception".encode("utf-8")
M.literal = term
M.search("utf-8", "SUBJECT")
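
The same trick can be wrapped in a small helper. This is only a sketch under a few assumptions: it targets Python 3, where the literal must be bytes; search_subject is a hypothetical name; and the literal attribute appears to be an internal detail of imaplib rather than a documented interface, so it may behave differently across versions.

```python
def search_subject(conn, text):
    """Search the selected mailbox for `text` in the Subject header.

    `conn` is expected to be a logged-in imaplib connection with a
    mailbox already selected. The search term is sent as an IMAP
    literal, which imaplib appends after the command arguments.
    """
    # imaplib consumes this attribute on the next command it sends.
    conn.literal = text.encode("utf-8")
    typ, data = conn.search("UTF-8", "SUBJECT")
    if typ != "OK":
        raise RuntimeError("SEARCH failed: %r" % (data,))
    # data[0] is a space-separated list of message numbers (as bytes).
    return data[0].split()
```

With the connection from the question, a call would look like search_subject(M, u"réception"), and the returned message numbers can be passed to M.fetch as before.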

abbot answered Jan 29 '23 08:01