I want to retrieve the body (text only) of emails using Python's IMAP and email packages.
As per this SO thread, I'm using the following code:
import email

# email_body is the raw message string fetched over IMAP
mail = email.message_from_string(email_body)
bodytext = mail.get_payload()[0].get_payload()
Although this works fine in some cases, sometimes I get a response similar to the following:
[<email.message.Message instance at 0x0206DCD8>, <email.message.Message instance at 0x0206D508>]
You are assuming that messages have a uniform structure with one well-defined "main part". That is not the case: a message can consist of a single part that is not text (just an "attachment" of a binary file, and nothing else), or it can be a multipart with several textual parts (or, again, none at all), and even if there is only one, it need not be the first part. Furthermore, multiparts can be nested (one or more parts may itself be another MIME message, recursively).
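To see which of these shapes a given message has, it can help to print its MIME tree first. The following is only a rough sketch: `describe_structure` is a hypothetical helper name, not something from the email package, and it relies only on `is_multipart()`, `get_payload()` and `get_content_type()`.

```python
def describe_structure(msg, depth=0):
    """Return one indented line per MIME part, showing its content type."""
    lines = ["  " * depth + msg.get_content_type()]
    # For multipart (and message/rfc822) parts, get_payload() returns a
    # list of nested Message objects, so recurse into each of them.
    if msg.is_multipart():
        for part in msg.get_payload():
            lines.extend(describe_structure(part, depth + 1))
    return lines
```

Calling `print("\n".join(describe_structure(mail)))` on the `mail` object from the question should show whether the first part is itself a multipart, which is what produces the list of Message instances.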
In so many words, you must inspect the MIME structure, then decide which part(s) are relevant for your application. If you only receive messages from a fairly static, small set of clients, you may be able to cut some corners (at least until the next upgrade of Microsoft Plague hits) but in general, there simply isn't a hierarchy of any kind, just a collection of (not necessarily always directly related) equally important parts.
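One possible way to do that inspection, if all you want is the plain-text parts, is to walk the whole tree and keep every `text/plain` leaf that is not marked as an attachment. This is a minimal sketch assuming Python 3; `extract_text_parts` is a made-up name, and the choice to skip attachments and to fall back to UTF-8 is my assumption, not a rule of the format.

```python
import email


def extract_text_parts(raw_message):
    """Return the decoded text/plain parts of a raw message string."""
    msg = email.message_from_string(raw_message)
    texts = []
    # walk() visits the message itself and every nested part, depth first.
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no text of their own
        if part.get_content_type() != "text/plain":
            continue
        # Assumption: parts explicitly marked as attachments are not body text.
        disposition = str(part.get("Content-Disposition", "")).lower()
        if "attachment" in disposition:
            continue
        payload = part.get_payload(decode=True)  # bytes, transfer-decoded
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            texts.append(payload.decode(charset, errors="replace"))
        except LookupError:
            # Unknown charset label: fall back to UTF-8 (an assumption).
            texts.append(payload.decode("utf-8", errors="replace"))
    return texts
```

This returns a list rather than a single string on purpose, since there may be zero, one, or several text parts and which ones matter depends on your application.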
The main problem in my case is that a replied or forwarded message shows up as a message instance in bodytext.
Solved my problem using the following code:
bodytext = mail.get_payload()[0].get_payload()
if isinstance(bodytext, list):
    # Nested multipart: the payload is a list of Message objects, so join their string forms
    bodytext = ','.join(str(v) for v in bodytext)