Lately, I've had lots of trouble with __repr__(), format(), and encodings.  Should the output of __repr__() be encoded or be a unicode string?  Is there a best encoding for the result of __repr__() in Python?  What I want to output does have non-ASCII characters.
I use Python 2.x, and want to write code that can easily be adapted to Python 3. The program thus uses
# -*- coding: utf-8 -*-
from __future__ import unicode_literals, print_function  # The 'Hello' literal represents a Unicode object
Here are some additional problems that have been bothering me, and I'm looking for a solution that solves them:
- […] sys.stdout.encoding set to UTF-8, but it would be best if other cases worked too).
- […] sys.stdout.encoding is None).
- […] __repr__() functions currently has many return ….encode('utf-8'), and that's heavy.  Is there anything robust and lighter?
- […] return ('<{}>'.format(repr(x).decode('utf-8'))).encode('utf-8'), i.e., the representation of objects is decoded, put into a formatting string, and then re-encoded.  I would like to avoid such convoluted transformations.
What would you recommend to do in order to write simple __repr__() functions that behave nicely with respect to these encoding questions?
In Python2, __repr__ (and __str__) must return a string object, not a unicode object. In Python3, the situation is reversed: __repr__ and __str__ must return unicode objects, not byte (née string) objects:
class Foo(object):
    def __repr__(self):
        return u'\N{WHITE SMILING FACE}'

class Bar(object):
    def __repr__(self):
        return u'\N{WHITE SMILING FACE}'.encode('utf8')

repr(Bar())
# ☺
repr(Foo())
# UnicodeEncodeError: 'ascii' codec can't encode character u'\u263a' in position 0: ordinal not in range(128)

In Python2, you don't really have a choice. You have to pick an encoding for the return value of __repr__.
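To see where that UnicodeEncodeError comes from, here is a small sketch that performs by hand the ASCII encoding Python 2 attempts when __repr__ returns a unicode object, and compares it with an explicit UTF-8 encoding. The variable names (smiley, error, encoded) are only for illustration, and the snippet is written so that it should run on both Python 2 and 3.

```python
from __future__ import unicode_literals

smiley = '\N{WHITE SMILING FACE}'  # U+263A, a non-ASCII character

# Python 2 converts a unicode __repr__ result with the default codec,
# which is ASCII. Doing the same conversion explicitly fails the same way.
error = None
try:
    smiley.encode('ascii')
except UnicodeEncodeError as exc:
    error = exc  # 'ascii' codec can't encode character ...

# Choosing the encoding yourself avoids the implicit ASCII step.
encoded = smiley.encode('utf-8')  # three bytes: E2 98 BA
```

So the choice is not whether to encode in Python 2, only which codec does the encoding: the implicit ASCII one or the one you name.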
By the way, have you read the PrintFails wiki? It may not directly answer your other questions, but I did find it helpful in illuminating why certain errors occur.
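When using from __future__ import unicode_literals,
('<{}>'.format(repr(x).decode('utf-8'))).encode('utf-8')
can be more simply written as
str('<{}>').format(repr(x))
Here str() encodes the literal with the default codec (ASCII on a stock Python 2), which is enough because '<{}>' is pure ASCII; the UTF-8 bytes returned by repr(x) are then inserted unchanged.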
Without from __future__ import unicode_literals, the expression can be written as:
'<{}>'.format(repr(x)) 
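As a rough illustration of the nested case, the sketch below wraps one object's repr inside another's using the str('<{}>') form. Inner and Wrapper are hypothetical classes made up for this example, and the version check in Inner is just one way to return the native string type; treat it as a sketch that should behave the same on Python 2 and 3, not as the only way to do it.

```python
from __future__ import unicode_literals
import sys

PY2 = sys.version_info[0] < 3


class Inner(object):  # hypothetical class with a non-ASCII repr
    def __repr__(self):
        text = 'Inner \N{WHITE SMILING FACE}'
        # Python 2 wants bytes from __repr__, Python 3 wants text.
        return text.encode('utf-8') if PY2 else text


class Wrapper(object):  # hypothetical class that embeds another repr
    def __init__(self, x):
        self.x = x

    def __repr__(self):
        # str is the native string type on both versions, so the template
        # and repr(self.x) have the same type and no decode/encode round
        # trip is needed.
        return str('<{}>').format(repr(self.x))


wrapped = repr(Wrapper(Inner()))
```

The point is that both operands of format() are native strings, so nothing is implicitly converted through the ASCII codec.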
I think a decorator can manage __repr__ incompatibilities in a sane way. Here's what I use:
from __future__ import unicode_literals, print_function
import sys
def force_encoded_string_output(func):
    if sys.version_info.major < 3:
        def _func(*args, **kwargs):
            return func(*args, **kwargs).encode(sys.stdout.encoding or 'utf-8')
        return _func
    else:
        return func
class MyDummyClass(object):
    @force_encoded_string_output
    def __repr__(self):
        return 'My Dummy Class! \N{WHITE SMILING FACE}'
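For what it's worth, here is a slightly more defensive variant of the same decorator, as a sketch only. The changes are my own assumptions about what might be useful: functools.wraps keeps the method's name and docstring, and getattr guards against a replaced sys.stdout that has no encoding attribute. Point is a hypothetical example class.

```python
from __future__ import unicode_literals
import functools
import sys


def force_encoded_string_output(func):
    """Encode the wrapped method's result on Python 2; no-op on Python 3."""
    if sys.version_info[0] >= 3:
        return func  # Python 3: __repr__ already returns text

    @functools.wraps(func)  # preserve __name__ and __doc__
    def wrapper(*args, **kwargs):
        # Looked up at call time: the encoding is None when output is
        # piped, and a replaced sys.stdout may lack the attribute.
        encoding = getattr(sys.stdout, 'encoding', None) or 'utf-8'
        return func(*args, **kwargs).encode(encoding)
    return wrapper


class Point(object):  # hypothetical example class
    def __init__(self, x, y):
        self.x, self.y = x, y

    @force_encoded_string_output
    def __repr__(self):
        return 'Point({}, {}) \N{WHITE SMILING FACE}'.format(self.x, self.y)


point_repr = repr(Point(1, 2))
```

Note that encoding with the terminal's codec can still fail if that codec cannot represent the character (a Latin-1 terminal has no smiley), so falling back to UTF-8 only covers the case where the encoding is unknown.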