Here is a method which tries to get the html part of an email message:
from __future__ import absolute_import, division, unicode_literals, print_function
import email
html_mail_quoted_printable=b'''Subject: =?ISO-8859-1?Q?WG=3A_Wasenstra=DFe_84_in_32052_Hold_Stau?=
MIME-Version: 1.0
Content-type: multipart/mixed;
    Boundary="0__=4EBBF4C4DFD012538f9e8a93df938690918c4EBBF4C4DFD01253"

--0__=4EBBF4C4DFD012538f9e8a93df938690918c4EBBF4C4DFD01253
Content-type: multipart/alternative;
    Boundary="1__=4EBBF4C4DFD012538f9e8a93df938690918c4EBBF4C4DFD01253"

--1__=4EBBF4C4DFD012538f9e8a93df938690918c4EBBF4C4DFD01253
Content-type: text/plain; charset=ISO-8859-1
Content-transfer-encoding: quoted-printable

Freundliche Gr=FC=DFe

--1__=4EBBF4C4DFD012538f9e8a93df938690918c4EBBF4C4DFD01253
Content-type: text/html; charset=ISO-8859-1
Content-Disposition: inline
Content-transfer-encoding: quoted-printable

<html><body>
Freundliche Gr=FC=DFe
</body></html>

--1__=4EBBF4C4DFD012538f9e8a93df938690918c4EBBF4C4DFD01253--

--0__=4EBBF4C4DFD012538f9e8a93df938690918c4EBBF4C4DFD01253--
'''

def get_html_part(msg):
    for part in msg.walk():
        if part.get_content_type() == 'text/html':
            return part.get_payload(decode=True)

msg = email.message_from_string(html_mail_quoted_printable)
html = get_html_part(msg)
print(type(html))
print(html)
Output:
<type 'str'>
<html><body>
Freundliche Gr��e
</body></html>
Unfortunately I get a byte string. I would like to have a unicode string.
According to this answer, msg.get_payload(decode=True) should do the magic, but it does not in this case.
How to decode a mime part of a message and get a unicode string in Python 2.7?
"Unfortunately I get a byte string. I would like to have a unicode string."
The decode=True parameter to get_payload only decodes the Content-Transfer-Encoding wrapper, which is the =XX quoted-printable encoding in this message. Getting from those bytes to characters is one of the many things the email package makes you do yourself:
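# decode=True only undoes the Content-Transfer-Encoding (here quoted-printable)
payload = part.get_payload(decode=True)
# charset declared on the part, or iso-8859-1 if none is declared
charset = part.get_content_charset('iso-8859-1')
# byte string -> unicode string; undecodable bytes become U+FFFD
chars = payload.decode(charset, 'replace')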
(iso-8859-1 being the fallback in case the message specifies no encoding.)
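Putting those three lines back into the question's function might look like the sketch below. The names get_html_text, parse_bytes, fallback_charset and sample are made up for this illustration and are not part of the email package. The parse_bytes shim is also an assumption of mine, added so that the same snippet runs on Python 2.7 and Python 3. The sample is a cut-down single-part message rather than the full multipart mail from the question.

```python
from __future__ import absolute_import, division, unicode_literals, print_function
import email

# Python 3 offers email.message_from_bytes; on Python 2.7 a byte string is
# parsed with email.message_from_string. (Hypothetical compatibility shim.)
parse_bytes = getattr(email, 'message_from_bytes', email.message_from_string)


def get_html_text(msg, fallback_charset='iso-8859-1'):
    """Return the first text/html part as a unicode string, or None."""
    for part in msg.walk():
        if part.get_content_type() != 'text/html':
            continue
        # Step 1: undo the Content-Transfer-Encoding; the result is still bytes.
        payload = part.get_payload(decode=True)
        # Step 2: look up the declared charset, with a fallback if none is given.
        charset = part.get_content_charset(fallback_charset)
        # Step 3: bytes -> unicode.
        try:
            return payload.decode(charset, 'replace')
        except LookupError:
            # The part declared a charset name that Python does not know.
            return payload.decode(fallback_charset, 'replace')
    return None


# Cut-down sample: one quoted-printable text/html part in ISO-8859-1.
sample = (b'MIME-Version: 1.0\n'
          b'Content-Type: text/html; charset=ISO-8859-1\n'
          b'Content-Transfer-Encoding: quoted-printable\n'
          b'\n'
          b'<html><body>\n'
          b'Freundliche Gr=FC=DFe\n'
          b'</body></html>\n')

html = get_html_text(parse_bytes(sample))
print(type(html))
```

The type printed should be unicode on Python 2.7 and str on Python 3. Falling back to iso-8859-1 is a choice rather than a rule: every byte value maps to some character in that charset, so the fallback decode cannot fail, although the text may be wrong if the sender actually used a different encoding.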