Pythonic way to ensure Unicode in Python 2 and 3

Question

I'm porting a library so that it is compatible with both Python 2 and 3. The library receives strings or string-like objects from the calling application, and I need to ensure those objects are converted to Unicode strings.

In Python 2 I can do:

unicode_x = unicode(x)

In Python 3 I can do:

unicode_x = str(x)

However, the best cross-version solution I have is:

import sys

def ensure_unicode(x):
  # Python 2: unicode is the text type; Python 3: str is.
  if sys.version_info < (3, 0):
    return unicode(x)
  return str(x)

which certainly doesn't seem great (although it works). Is there a better solution?

I am aware of unicode_literals and the u prefix, but neither works here: the inputs come from clients and are not literals in my library.

Martijn Pieters · Accepted Answer

Don't re-invent the compatibility layer wheel. Use the six compatibility layer, a small one-file project that can be included with your own:

Six supports every Python version since 2.6. It is contained in only one Python file, so it can be easily copied into your project. (The copyright and license notice must be retained.)

It includes a six.text_type() callable that does exactly this, converting a value to Unicode text:

import six

unicode_x = six.text_type(x)

In the project source code this is defined as:

import sys

PY2 = sys.version_info[0] == 2
PY3 = sys.version_info[0] == 3
# ...

if PY3:
    # ...
    text_type = str
    # ...

else:
    # ...
    text_type = unicode
    # ...
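
One caveat when the inputs may be byte strings: a bare text_type(x) call does not behave the same on both versions. On Python 2, unicode(b) decodes with the default codec (ASCII unless changed), while on Python 3, str(b) returns the repr, such as "b'abc'". Below is a minimal sketch of a helper that decodes bytes explicitly. The ensure_unicode name, the encoding parameter, and the UTF-8 default are assumptions for illustration, not part of six, and the alias is recreated locally only so the snippet runs without six installed.

```python
import sys

# Same idea as six.text_type, recreated here so the sketch is self-contained.
if sys.version_info[0] >= 3:
    text_type = str
else:
    text_type = unicode  # noqa: F821 (name only exists on Python 2)


def ensure_unicode(x, encoding="utf-8"):
    """Return x as Unicode text, decoding byte strings explicitly.

    The UTF-8 default is an assumption; use whatever encoding your
    callers actually send.
    """
    if isinstance(x, text_type):
        return x  # already Unicode text
    if isinstance(x, (bytes, bytearray)):
        # On Python 2, bytes is an alias for str, so this branch also
        # covers native byte strings there.
        return x.decode(encoding)
    # Fall back to the object's own text conversion.
    return text_type(x)
```

With six available, the local alias can be replaced by six.text_type; the explicit decode branch is the part that keeps the result consistent across versions.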
