How do I override the str function without raising a UnicodeEncodeError?

I am puzzled that defining __str__ on a class seems to have no effect when I call the str function on an instance of that class. For example, I read in the Django documentation that:

The print statement and the str built-in call __str__() to determine the human-readable representation of an object.

But that doesn't appear to be true. Here's an example from a module where text is always assumed to be unicode:

import six

class Test(object):

    def __init__(self, text):
        self._text = text

    def __str__(self):
        if six.PY3:
            return str(self._text)
        else:
            return unicode(self._text)

    def __unicode__(self):
        if six.PY3:
            return str(self._text)
        else:
            return unicode(self._text)

In Python 2, it gives the following behavior:

>>> a=Test(u'café')
>>> print a.__str__()
café
>>> print a # same error with str(a)
---------------------------------------------------------------------------
UnicodeEncodeError                        Traceback (most recent call last)
<ipython-input-63-202e444820fd> in <module>()
----> 1 str(a)

UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 3: ordinal not in range(128)

Is there a way to overload the str function?

Ray Osborn asked May 06 '16 21:05



1 Answer

For Python 2, you are returning the wrong type from the __str__ method. You are returning unicode, while you must return str:

def __str__(self):
    if six.PY3:
        return str(self._text)
    else:
        return self._text.encode('utf8')

Because self._text is a unicode object rather than a str, you need to encode it yourself. When __str__ returns unicode instead, Python 2 has to encode the result implicitly, and the default ASCII codec can't handle the non-ASCII é character.

Printing the object results in the right output only because my terminal is configured to handle UTF-8:

>>> a = Test(u'café')
>>> str(a)
'caf\xc3\xa9'
>>> print a
café
>>> unicode(a)
u'caf\xe9'
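The encoding failure described above can be reproduced without the class at all. The following is a minimal sketch, assuming the implicit conversion behaves like an explicit ASCII encode; the variable names are only for illustration, and the escape \xe9 stands for é so the snippet does not depend on the source file encoding:

```python
text = u'caf\xe9'  # u'café'

# Explicit UTF-8 encoding, as in the corrected __str__ method.
utf8_bytes = text.encode('utf8')

# An ASCII encode of the same text fails, because é is outside
# the 128 characters ASCII can represent.
try:
    text.encode('ascii')
    ascii_failed = False
    failure_reason = None
except UnicodeEncodeError as exc:
    ascii_failed = True
    failure_reason = exc.reason

print(repr(utf8_bytes))
print(ascii_failed, failure_reason)
```

The reason reported by the exception matches the tail of the traceback in the question.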

Note that there is no __unicode__ method in Python 3; your if six.PY3 in that method is entirely redundant. The following would work too:

class Test(object):
    def __init__(self, text):
        self._text = text

    def __str__(self):
        if six.PY3:
            return self._text
        else:
            return self._text.encode('utf8')

    def __unicode__(self):
        return self._text

However, if you are using the six library, you'd be far better off using the @six.python_2_unicode_compatible decorator and defining only a Python 3 version of the __str__ method:

@six.python_2_unicode_compatible
class Test(object):
    def __init__(self, text):
        self._text = text

    def __str__(self):
        return self._text

This assumes that text is always Unicode. If you are working with Django, then you can get the same decorator from the django.utils.encoding module.
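To make the decorator less of a black box, here is a rough sketch of the idea behind it, written without importing six. The name unicode_compatible and the PY2 flag are hypothetical stand-ins, and this is not six's actual source: on Python 2 the text-returning method is kept as __unicode__ and __str__ is replaced with a version that returns UTF-8 encoded bytes, while on Python 3 the class is left untouched.

```python
import sys

PY2 = sys.version_info[0] == 2


def unicode_compatible(cls):
    """Hypothetical stand-in illustrating the decorator's approach."""
    if PY2:
        # Keep the text-returning method available as __unicode__ ...
        cls.__unicode__ = cls.__str__
        # ... and have __str__ return UTF-8 encoded bytes instead.
        cls.__str__ = lambda self: self.__unicode__().encode('utf-8')
    return cls


@unicode_compatible
class Test(object):
    def __init__(self, text):
        self._text = text

    def __str__(self):
        # Written once, in the Python 3 style: always return text.
        return self._text


a = Test(u'caf\xe9')  # u'café'
print(str(a))
```

In real code, prefer the decorator shipped with six or Django over a hand-rolled version like this one.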

Martijn Pieters answered Sep 24 '22 02:09