I'm building a system that reads emails from a Gmail account and fetches their subjects, using Python's imaplib and email modules. Sometimes, emails received from a Hotmail account have line breaks in their headers, for instance:
In [4]: message['From']
Out[4]: '=?utf-8?B?aXNhYmVsIG1hcsOtYSB0b2Npbm8gZ2FyY8OtYQ==?=\r\n\t<[email protected]>'
If I try to decode that header, it does nothing:
In [5]: email.header.decode_header(message['From'])
Out[5]: [('=?utf-8?B?aXNhYmVsIG1hcsOtYSB0b2Npbm8gZ2FyY8OtYQ==?=\r\n\t<[email protected]>', None)]
However, if I replace the line break and tab with a space, it works:
In [6]: email.header.decode_header(message['From'].replace('\r\n\t', ' '))
Out[6]: [('isabel mar\xc3\xada tocino garc\xc3\xada', 'utf-8'), ('<[email protected]>', None)]
Is this a bug in decode_header? If not, I would like to know what other special cases like this I should be aware of.
It is a bug in decode_header; it is present in Python 2.7 and fixed in Python 3.3.
>>> sys.version_info
sys.version_info(major=3, minor=3, micro=2, releaselevel='final', serial=0)
>>> email.header.decode_header('=?utf-8?B?aXNhYmVsIG1hcsOtYSB0b2Npbm8gZ2FyY8OtYQ==?=\r\n\t<[email protected]>')
[(b'isabel mar\xc3\xada tocino garc\xc3\xada', 'utf-8'), (b'<[email protected]>', None)]
vs
>>> sys.version_info
sys.version_info(major=2, minor=7, micro=5, releaselevel='final', serial=0)
>>> email.header.decode_header('=?utf-8?B?aXNhYmVsIG1hcsOtYSB0b2Npbm8gZ2FyY8OtYQ==?=\r\n\t<[email protected]>')
[('=?utf-8?B?aXNhYmVsIG1hcsOtYSB0b2Npbm8gZ2FyY8OtYQ==?=\r\n\t<[email protected]>', None)]
This error still occurs in Python 2.7, so the following workaround can be used:
>>> email.header.decode_header('=?utf-8?B?aXNhYmVsIG1hcsOtYSB0b2Npbm8gZ2FyY8OtYQ==?=\r\n\t<[email protected]>'.replace('\r\n\t', ' '))
[('isabel mar\xc3\xada tocino garc\xc3\xada', 'utf-8'), ('<[email protected]>', None)]
This replaces the CRLF and the tab with a single space, after which decode_header parses the header correctly.
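Note that RFC 5322 allows a header to be folded with CRLF followed by any run of spaces or tabs, so a slightly more general version of the workaround is to unfold with a regular expression instead of a literal '\r\n\t' replace. A minimal sketch (the helper name unfold_header is just for illustration):

```python
import re
from email.header import decode_header

def unfold_header(value):
    """Replace any header folding (CRLF or bare LF followed by
    spaces/tabs) with a single space before decoding."""
    return re.sub(r'\r?\n[ \t]+', ' ', value)

raw = ('=?utf-8?B?aXNhYmVsIG1hcsOtYSB0b2Npbm8gZ2FyY8OtYQ==?='
       '\r\n\t<[email protected]>')

parts = decode_header(unfold_header(raw))
text, charset = parts[0]
print(text.decode(charset))  # -> isabel maría tocino garcía
```

This handles folds that use plain spaces or multiple whitespace characters, which the single `.replace('\r\n\t', ' ')` call would miss.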