
Python 3 CGI: how to output raw bytes

I decided to use Python 3 for making my website, but I encountered a problem with Unicode output.

It seems like a plain print(html) (where html is a str) should work, but it doesn't. I get UnicodeEncodeError: 'ascii' codec can't encode characters[...]: ordinal not in range(128). This must be because the web server doesn't support Unicode output.

The next thing I tried was print(html.encode('utf-8')), but I got something like the repr output of the byte string: it is placed inside b'...' and all the escape sequences appear in raw form (e.g. \n and \xd0\x9c).

Please show me the correct way to output a Unicode (str) string as raw UTF-8 encoded bytes in Python 3.1.

Oleh Prypin asked Apr 01 '11 14:04



1 Answer

The problem here is that your stdout isn't attached to an actual terminal, so it uses the ASCII encoding by default. Therefore you need to write to sys.stdout.buffer, which is the "raw" binary output of sys.stdout. This can be done in various ways; the most common one seems to be:

import codecs, sys
writer = codecs.getwriter('utf8')(sys.stdout.buffer)

And then use writer. In a CGI script you may be able to replace sys.stdout with the writer, like this:

sys.stdout = codecs.getwriter('utf8')(sys.stdout.buffer)

That might actually work, so you can print normally. Try it!
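To make the two options concrete, here is a minimal sketch of both: encoding by hand and writing the bytes to the binary stream, and wrapping the binary stream in a codecs writer as above. The function names write_bytes_directly and write_via_codecs_writer are made up for illustration, and an in-memory io.BytesIO stands in for sys.stdout.buffer so the result can be inspected. The Content-Type header line is an assumption about what a typical CGI response would send.

```python
import codecs
import io


def write_bytes_directly(binary_out, html):
    # Encode explicitly and hand the bytes to the binary stream,
    # bypassing the text layer and its default ASCII encoding.
    binary_out.write(html.encode('utf-8'))
    binary_out.flush()


def write_via_codecs_writer(binary_out, html):
    # Same wrapper as above: str goes in, UTF-8 bytes come out.
    writer = codecs.getwriter('utf8')(binary_out)
    # CGI headers end with a blank line; the charset tells the
    # browser how to decode the body.
    writer.write('Content-Type: text/html; charset=utf-8\r\n\r\n')
    writer.write(html)
    writer.flush()


# Demonstration against an in-memory binary stream
# (a stand-in for sys.stdout.buffer).
demo = io.BytesIO()
write_via_codecs_writer(demo, '<p>\u041c</p>\n')
```

In a real CGI script you would pass sys.stdout.buffer instead of the BytesIO object. Whichever approach you pick, declaring charset=utf-8 in the header keeps the browser's decoding in step with the bytes you send.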

Lennart Regebro answered Oct 04 '22 03:10
