Python 3 email body encoding

Tags: python, email, python-3.x, utf-8

I am working on setting up a script that forwards incoming mail to a list of recipients.

Here's what I have now:

I read the email from stdin (that's how postfix passes it):

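import sys
from email.parser import Parser

email_in = sys.stdin.read()
# parsestr() is the Parser method that accepts a str; parse() expects a file object
incoming = Parser().parsestr(email_in)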

sender = incoming['from']
this_address = incoming['to']

I test for multipart:

if incoming.is_multipart():
    for payload in incoming.get_payload():
        # if payload.is_multipart(): ...
        body = payload.get_payload()
else:
    body = incoming.get_payload(decode=True)

I set up the outgoing message:

import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

msg = MIMEMultipart()
msg['Subject'] = incoming['subject']
msg['From'] = this_address
msg['reply-to'] = sender
msg['To'] = "foo@bar.com"
msg.attach(MIMEText(body.encode('utf-8'), 'html', _charset='UTF-8'))

s = smtplib.SMTP('localhost')
s.send_message(msg)
s.quit()

This works well for ASCII-only (English) text: the message is forwarded as expected.

When I send non-ASCII characters, though, the forwarded message contains gibberish (depending on the email client, either raw bytes or ASCII representations of the UTF-8 characters).

What could be the problem? Is it on the incoming or the outgoing side?

asked Nov 18 '14 15:11

fonorobert

1 Answer

The problem is that many email clients (including Gmail) send non-ASCII emails with a base64 Content-Transfer-Encoding. Reading stdin gives you the raw message as text, and if you parse that with Parser and then call get_payload() without arguments, you get back a str that still contains the base64 data rather than the readable text.
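To see the difference, here is a small sketch that builds a sample message in-process instead of reading it from postfix. The sample text and variable names are made up for illustration; the standard library's MIMEText happens to pick base64 for UTF-8 bodies, which mimics what such clients send.

```python
from email import message_from_bytes
from email.mime.text import MIMEText

# Stand-in for an incoming mail: UTF-8 text, base64 transfer encoding.
sample = MIMEText("Árvíztűrő tükörfúrógép", "plain", "utf-8")
raw_bytes = sample.as_bytes()

incoming = message_from_bytes(raw_bytes)

encoded = incoming.get_payload()             # str, still base64
decoded = incoming.get_payload(decode=True)  # bytes, transfer encoding undone

print(incoming["Content-Transfer-Encoding"])
print(type(encoded).__name__, type(decoded).__name__)
print(decoded.decode("utf-8"))
```

The first get_payload() call is what the question's multipart branch does, which is why the forwarded body looked like gibberish.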

Instead, pass the optional decode=True argument to get_payload(). With it set, the method undoes the transfer encoding and returns bytes. You can then call the built-in decode() method on those bytes to get a str, here assuming the body is UTF-8:

body = payload.get_payload(decode=True)
body = body.decode('utf-8')
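Hard-coding 'utf-8' only works when the sender actually used UTF-8. A slightly more defensive variant, as a sketch, reads the charset each part declares. The helper name extract_text and its fallback_charset parameter are hypothetical, not part of the email package:

```python
from email import message_from_bytes
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def extract_text(message, fallback_charset="utf-8"):
    """Return the first text/* part of `message` as a str (hypothetical helper)."""
    for part in message.walk():
        if part.get_content_maintype() != "text":
            continue  # skip multipart containers and non-text attachments
        raw = part.get_payload(decode=True)  # bytes, transfer encoding undone
        charset = part.get_content_charset() or fallback_charset
        # errors="replace" avoids a crash on bytes that do not match the charset
        return raw.decode(charset, errors="replace")
    return ""


# Usage with a made-up multipart message whose body is ISO-8859-2, not UTF-8.
outer = MIMEMultipart("alternative")
outer.attach(MIMEText("Árvíztűrő tükörfúrógép", "plain", "iso-8859-2"))
parsed = message_from_bytes(outer.as_bytes())
print(extract_text(parsed))
```

walk() also covers the nested-multipart case that the commented-out line in the question hints at. An unknown charset name would still raise LookupError, so real code may want to catch that.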

Ned Batchelder's talk offers great insight into UTF-8 and Python.

My final code works a bit differently; you can check that here, too.
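For the outgoing side, here is a minimal sketch of how the pieces could fit together. This is not the final code mentioned above. The helper build_forward and the main function are hypothetical, and the recipient address is the placeholder from the question. It reads bytes from sys.stdin.buffer so that nothing is decoded before the email parser runs, and it passes MIMEText a str rather than encoded bytes:

```python
import smtplib
import sys
from email import message_from_bytes
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_forward(subject, sender, this_address, recipient, body):
    """Assemble the outgoing message (hypothetical helper)."""
    msg = MIMEMultipart()
    msg["Subject"] = subject
    msg["From"] = this_address
    msg["Reply-To"] = sender
    msg["To"] = recipient
    # Pass the already-decoded str; MIMEText encodes it using the given charset.
    msg.attach(MIMEText(body, "html", "utf-8"))
    return msg


def main():
    # Postfix pipes the raw message to stdin; read it as bytes.
    incoming = message_from_bytes(sys.stdin.buffer.read())
    # Simplification: use the first text part (raises StopIteration if there is
    # none) and assume UTF-8 when the part declares no charset.
    part = next(p for p in incoming.walk() if p.get_content_maintype() == "text")
    body = part.get_payload(decode=True).decode(part.get_content_charset() or "utf-8")
    msg = build_forward(incoming["subject"], incoming["from"], incoming["to"],
                        "foo@bar.com", body)
    with smtplib.SMTP("localhost") as smtp:
        smtp.send_message(msg)
```

Calling main() from the script that postfix invokes would forward one message per run. It needs an SMTP server listening on localhost, as in the question.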

answered Oct 13 '22 09:10

fonorobert
