How to do string formatting with unicode emdash?

I am trying to do string formatting with a unicode variable. For example:

>>> x = u"Some text—with an emdash."
>>> x
u'Some text\u2014with an emdash.'
>>> print(x)
Some text—with an emdash.
>>> s = "{}".format(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2014' in position 9: ordinal not in range(128)

>>> t = "%s" %x
>>> t
u'Some text\u2014with an emdash.'
>>> print(t)
Some text—with an emdash.

You can see that I have a unicode string and that it prints just fine. The trouble comes when I use Python's new (and improved?) format() method. With the old style (%s) everything works out fine, but with {} and format() it fails.

Any idea why this is happening? I am using Python 2.7.2.

jlconlin asked Nov 16 '11 13:11


3 Answers

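The new format() is not as forgiving when you mix byte strings and unicode strings. In Python 2, "{}" is a byte string (str), so format() has to convert x to str, and that implicit conversion uses the ASCII codec, which cannot encode the em dash. The % operator instead promotes the whole result to unicode, which is why "%s" % x works. So make the template unicode as well: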

s = u"{}".format(x)
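To see why the u prefix matters, here is a minimal sketch that runs on both Python 2.7 and Python 3.3+. The explicit x.encode("ascii") call is only a stand-in for the implicit conversion a byte-string template performs in Python 2; the variable names ascii_failed and encoded are made up for illustration.

```python
# -*- coding: utf-8 -*-
# Sketch only: works on Python 2.7 and Python 3.3+.
x = u"Some text\u2014with an emdash."

# A byte-string template in Python 2 has to turn x into bytes first.
# That implicit step behaves roughly like this explicit ASCII encode.
try:
    x.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True  # U+2014 is outside the ASCII range

# A unicode template never needs that conversion, so it succeeds.
s = u"{}".format(x)

# Encode explicitly, and only where bytes are really required
# (writing to a socket, a binary file, and so on).
encoded = s.encode("utf-8")
```

Keeping everything unicode inside the program and encoding only at the edges avoids most surprises of this kind.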
wutz answered Nov 09 '22 20:11


The same way, as long as the format string itself is unicode:

>>> s = u"{0}".format(x)
>>> s
u'Some text\u2014with an emdash.'
Ignacio Vazquez-Abrams answered Nov 09 '22 20:11


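Using the following worked well for me. It is a variant on the other answers. Note that the output shown is what Python 3 prints, where every str is already unicode; on Python 2.7 the template still needs the u prefix (u"a{0}b"), otherwise it raises the same UnicodeEncodeError as in the question.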

>>> emDash = u'\u2014'
>>> "a{0}b".format(emDash)
'a—b'
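If prefixing every literal with u is impractical on Python 2.7, one option not mentioned in this thread is the unicode_literals future import, which makes plain string literals unicode for the whole module. A minimal sketch, assuming the import sits at the top of the file (the names em_dash and result are illustrative):

```python
# -*- coding: utf-8 -*-
# Future imports must come before any other code in the module.
from __future__ import unicode_literals

# With unicode_literals, unprefixed literals are unicode on Python 2.7.
# On Python 3 the import is accepted and changes nothing.
em_dash = "\u2014"

# The template is unicode as well, so no implicit ASCII encode happens.
result = "a{0}b".format(em_dash)
```

The import affects every literal in the module, so code that relies on literals being byte strings would need a b prefix after adding it.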
PolyGeo answered Nov 09 '22 19:11