Python unicode handling differences between print and sys.stdout.write

Question

I'll start by saying that I've already seen this post: Strange python print behavior with unicode, but the solution offered there (using PYTHONIOENCODING) didn't work for me.

Here's my issue:

Python 2.6.5 (r265:79063, Apr  9 2010, 11:16:46)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-44)] on linux2
>>> a = u'\xa6'
>>> print a
¦

works just fine, however:

>>> sys.stdout.write(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa6' in position 0: ordinal not in range(128)

throws an error. The post I linked to at the top suggests that this is because the default console encoding is 'ascii'. However, in my case it's not:

>>> sys.stdout.encoding
'UTF-8'

So any thoughts on what's at work here and how to fix this issue?

Thanks D.

ekhumoro · Accepted Answer

This is due to a long-standing bug that was fixed in python-2.7, but too late to be back-ported to python-2.6.

The documentation states that when unicode strings are written to a file, they should be converted to byte strings using file.encoding. But this was not being honoured by sys.stdout, which instead was using the default unicode encoding. This is usually set to "ascii" by the site module, but it can be changed with sys.setdefaultencoding:

Python 2.6.7 (r267:88850, Aug 14 2011, 12:32:40) [GCC 4.6.2] on linux3
>>> import sys
>>> a = u'\xa6'
>>> sys.stdout.write(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa6' ...
>>> reload(sys).setdefaultencoding('utf8')
>>> sys.stdout.write(a)
¦
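The codec mismatch described above can be reproduced without touching sys.stdout at all. The following is only a minimal sketch: try_encode is a hypothetical helper name, not part of any library, and the values noted in the comments assume a stock interpreter whose default encoding has not been changed.

```python
import sys

text = u'\xa6'  # BROKEN BAR, the character from the question


def try_encode(s, encoding):
    """Hypothetical helper: return the encoded bytes, or None on failure."""
    try:
        return s.encode(encoding)
    except UnicodeEncodeError:
        return None


# The two settings the answer distinguishes. On a stock Python 2 the first
# is 'ascii'; the second reflects the console and may be None when piped.
default_encoding = sys.getdefaultencoding()
stdout_encoding = getattr(sys.stdout, 'encoding', None)

# 'ascii' cannot represent U+00A6, which is the error sys.stdout.write hit.
ascii_result = try_encode(text, 'ascii')

# 'utf-8' can, which is what print was effectively using.
utf8_result = try_encode(text, 'utf-8')
```

Comparing default_encoding with stdout_encoding on the affected machine should show which of the two codecs each output path ends up using.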

However, a better solution might be to replace sys.stdout with a wrapper:

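import sys

class StdOut(object):
    """Stand-in for sys.stdout that encodes unicode before writing."""
    def write(self, string):
        # Encode unicode with the console's encoding; pass byte strings through.
        if isinstance(string, unicode):
            string = string.encode(sys.__stdout__.encoding)
        sys.__stdout__.write(string)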

>>> sys.stdout = StdOut()
>>> sys.stdout.write(a)
¦
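One possible refinement of the wrapper, offered as a sketch rather than a tested drop-in: in Python 2 the stream's encoding attribute can be None when output is piped or redirected, and other code may expect methods such as flush() on sys.stdout. The class name EncodingWriter, the text_type alias, and the 'utf-8' fallback are all assumptions made for this illustration and are not part of the answer above.

```python
import io

try:
    text_type = unicode  # Python 2
except NameError:
    text_type = str      # Python 3, so the sketch can be run there too


class EncodingWriter(object):
    """Hypothetical wrapper: encode text before handing it to a byte stream."""

    def __init__(self, stream, encoding=None, errors='strict'):
        self._stream = stream
        # Assumption: fall back to UTF-8 when the stream reports no encoding.
        self._encoding = (encoding
                          or getattr(stream, 'encoding', None)
                          or 'utf-8')
        self._errors = errors

    def write(self, data):
        # Text is encoded explicitly; byte strings pass through untouched.
        if isinstance(data, text_type):
            data = data.encode(self._encoding, self._errors)
        self._stream.write(data)

    def __getattr__(self, name):
        # Delegate everything else (flush, isatty, ...) to the real stream.
        return getattr(self._stream, name)


# Demonstration against an in-memory byte stream instead of the console.
buf = io.BytesIO()
writer = EncodingWriter(buf, encoding='utf-8')
writer.write(u'\xa6')
writer.flush()
```

On Python 2.6 it would presumably be installed the same way as the class above, with sys.stdout = EncodingWriter(sys.__stdout__).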

Tags: python, stdout, unicode, python-2.7

Dmitry B.
