I am trying to do string formatting with a unicode variable. For example:
>>> x = u"Some text—with an emdash."
>>> x
u'Some text\u2014with an emdash.'
>>> print(x)
Some text—with an emdash.
>>> s = "{}".format(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2014' in position 9: ordinal not in range(128)
>>> t = "%s" %x
>>> t
u'Some text\u2014with an emdash.'
>>> print(t)
Some text—with an emdash.
You can see that I have a unicode string and that it prints just fine. The trouble is when I use Python's new (and improved?) format() function. If I use the old style (using %s) everything works out fine, but when I use {} and the format() function, it fails.
Any ideas of why this is happening? I am using Python 2.7.2.
The new format() is not as forgiving when you mix ASCII byte strings and unicode strings, so make the template unicode as well:
s = u"{}".format(x)
The same approach works with an explicit positional index ({0}) in a unicode template:
>>> s = u"{0}".format(x)
>>> s
u'Some text\u2014with an emdash.'
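To illustrate why the u prefix matters: on Python 2, a byte-string template such as "{}" has to turn the formatted unicode value back into a byte string, and it does so with the default ASCII codec, which is the encode step named in the traceback. The old % operator instead promotes the whole result to unicode, which is why it succeeds. The sketch below reproduces that implicit encode by hand; ascii_encodable is a hypothetical helper name used only for illustration, not part of any library.

```python
x = u"Some text\u2014with an emdash."

def ascii_encodable(text):
    """Return True if text survives an ASCII encode.

    Hypothetical helper: it mimics the implicit conversion that a
    Python 2 byte-string template applies to a unicode argument.
    """
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False

print(ascii_encodable(u"plain text"))  # True
print(ascii_encodable(x))              # False: U+2014 is outside ASCII

# A unicode template never needs the ASCII encode, so it succeeds.
s = u"{}".format(x)
```

The snippet uses the \u2014 escape rather than a literal dash, so it does not depend on the source file encoding, and it runs unchanged on Python 2.7 and Python 3.3+.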
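Using the following worked well for me. It is a variant on the other answers. Note that the output shown is what Python 3 produces, where every str is unicode; on Python 2.7 a byte-string template like this one still raises UnicodeEncodeError unless it is prefixed with u.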
>>> emDash = u'\u2014'
>>> "a{0}b".format(emDash)
'a—b'