My application is written in Python. I run a script on each email received by Postfix and do something with the email content. Procmail is responsible for running the script, passing the email as input. The problem started when I converted the input message (plain text) into an email message object (because the latter comes in handy). I am using email.message_from_string (where email is the standard email module that ships with Python).
import email
message = email.message_from_string(original_mail_content)
message_body = message.get_payload()
message_body is sometimes a list ([email.message.Message instance, email.message.Message instance]) and sometimes a string (the actual body content of the incoming email). Why is that? I also noticed the following in the email.message.Message.get_payload() docstring:
"""
The payload will either be a list object or a string. If you mutate
the list object, you modify the message's payload in place....."""
So how do I write a generic method to get the body of an email in Python? Please help me out.
As crazy as it might seem, the reason for the sometimes-string, sometimes-list semantics is given in the documentation. Basically, get_payload() returns a list of Message objects for a multipart message and a string for everything else.
Rather than simply looking for a sub-part, use walk() to iterate through the message contents:
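To see the two cases side by side, here is a minimal sketch. Both raw messages are made up for illustration (the addresses and the boundary string are not from the question), and it assumes Python 3:

```python
import email

# Hypothetical single-part message.
PLAIN = (
    "From: a@example.com\n"
    "Subject: plain\n"
    "Content-Type: text/plain\n"
    "\n"
    "Hello, world.\n"
)

# Hypothetical multipart message with a plain and an HTML alternative.
MULTIPART = (
    "From: a@example.com\n"
    "Subject: multi\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "Hello in plain text.\n"
    "--XYZ\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>Hello in HTML.</p>\n"
    "--XYZ--\n"
)

plain_msg = email.message_from_string(PLAIN)
multi_msg = email.message_from_string(MULTIPART)

# Single-part message: the payload is the body string itself.
plain_payload = plain_msg.get_payload()

# Multipart message: the payload is a list of Message objects, one per part.
multi_payload = multi_msg.get_payload()

print(type(plain_payload).__name__, plain_msg.is_multipart())
print(type(multi_payload).__name__, multi_msg.is_multipart())
```

Checking is_multipart() before calling get_payload() tells you which of the two shapes to expect.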
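def walkMsg(msg):
    for part in msg.walk():
        # Multipart containers are only glue: decoding their payload
        # gives None, and walk() visits their sub-parts anyway.
        if part.get_content_maintype() == "multipart":
            continue
        # decode=True undoes base64 / quoted-printable transfer encoding.
        yield part.get_payload(decode=True)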
The walk() method is a generator: it returns an iterator that visits the message and all of its sub-parts in depth-first order. If the message is not a container of parts (i.e. it has no attachments or alternatives), walk() yields a single element, the message itself.
You want to skip any 'multipart' parts as they are just glue.
The above method yields the payload of every leaf part, attachments included. You may want to narrow this to only the text parts if they contain the info you are seeking.
Note that as of Python 2.5, the methods get_type(), get_main_type(), and get_subtype() have been removed; use get_content_type(), get_content_maintype(), and get_content_subtype() instead. See http://docs.python.org/library/email.message.html#email.message.Message.walk
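Restricting the walk to text parts could look roughly like the sketch below. The name iter_text_parts, the sample message, and the UTF-8 fallback for parts that declare no charset are all assumptions made for illustration, and it assumes Python 3, where get_payload(decode=True) returns bytes:

```python
import email

def iter_text_parts(msg):
    """Yield the decoded body of every text/* part in msg."""
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue  # skips multipart glue and non-text attachments
        raw = part.get_payload(decode=True)  # bytes on Python 3
        if raw is None:
            continue
        # Falling back to UTF-8 is an assumption for parts with no charset.
        charset = part.get_content_charset() or "utf-8"
        yield raw.decode(charset, errors="replace")

# Hypothetical message: a plain part, a base64 text part, a binary part.
SAMPLE = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="B"\n'
    "\n"
    "--B\n"
    "Content-Type: text/plain\n"
    "\n"
    "Hi there.\n"
    "--B\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "aGVsbG8=\n"
    "--B\n"
    "Content-Type: application/octet-stream\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "AAEC\n"
    "--B--\n"
)

sample_msg = email.message_from_string(SAMPLE)
texts = [text.strip() for text in iter_text_parts(sample_msg)]
print(texts)
```

A part that declares a charset Python does not know would make decode() raise LookupError, so real mail may need a try/except around that call.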
Well, the answers are correct, you should read the docs, but for an example of a generic way:
def get_first_text_part(msg):
    maintype = msg.get_content_maintype()
    if maintype == 'multipart':
        for part in msg.get_payload():
            if part.get_content_maintype() == 'text':
                return part.get_payload()
    elif maintype == 'text':
        return msg.get_payload()
This is prone to some disaster: the parts themselves might be multiparts, which this function does not descend into, and it only returns the first text part, which may not be the one you want. Still, you can play with it.
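One way around the nesting problem is to lean on walk(), which already descends into nested multiparts. The following is only a sketch under a few assumptions: the name get_first_text, the sample message, the UTF-8 fallback, and the "skip parts that carry a filename" heuristic are mine, and it targets Python 3:

```python
import email

def get_first_text(msg, subtype="plain"):
    """Return the decoded body of the first text/<subtype> part, or None."""
    wanted = "text/" + subtype
    for part in msg.walk():  # depth-first, so nested multiparts are covered
        if part.get_content_type() != wanted:
            continue
        if part.get_filename():
            continue  # heuristic: a part with a filename is an attachment
        raw = part.get_payload(decode=True)  # bytes on Python 3
        # Falling back to UTF-8 is an assumption for parts with no charset.
        charset = part.get_content_charset() or "utf-8"
        return raw.decode(charset, errors="replace")
    return None

# Hypothetical message: multipart/alternative nested inside multipart/mixed.
NESTED = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="OUTER"\n'
    "\n"
    "--OUTER\n"
    'Content-Type: multipart/alternative; boundary="INNER"\n'
    "\n"
    "--INNER\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "\n"
    "plain body\n"
    "--INNER\n"
    "Content-Type: text/html; charset=utf-8\n"
    "\n"
    "<p>html body</p>\n"
    "--INNER--\n"
    "--OUTER--\n"
)

nested_msg = email.message_from_string(NESTED)
print(get_first_text(nested_msg))
print(get_first_text(nested_msg, "html"))
```

With this sample, get_first_text_part() would return None, because the only direct child of the top-level message is itself a multipart. Preferring text/plain and falling back to text/html when the first call returns None is a common choice, but which part counts as "the body" is ultimately up to you.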