Different encodings used in "print s" vs "print [s]"?

Tags:

python

unicode

When I run the following in an IPython notebook:

s='½'
s
print s
print [s]

I see

'\xc2\xbd'
½
['\xc2\xbd']
  1. What's going on here?
  2. How do I print a list of Unicode strings? (i.e., I want to see ['½'])

Edit: From the comments, it looks like the difference is that "print s" uses s.__str__, while evaluating "s" at the prompt and "print [s]" both use s.__repr__ (a list displays each of its elements with repr).
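
To make that concrete, here is a minimal sketch. The class name Half and its return strings are made up for illustration; the only point is to show which method is called when an object is printed directly versus inside a list. It should run unchanged on Python 2 and Python 3.

```python
class Half(object):
    """Hypothetical class, used only to make the two methods visible."""

    def __str__(self):
        return "str() was called"

    def __repr__(self):
        return "repr() was called"


h = Half()
direct = str(h)      # what printing the object itself uses
in_list = str([h])   # what printing a list containing it uses

print(direct)
print(in_list)
```

The second line shows the repr text inside the brackets, which is why the escaped form of the string appears as soon as it is wrapped in a list.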

Yaroslav Bulatov asked Nov 01 '15 19:11


1 Answer

In Python 2 you can use the repr function to build a printable representation of the list, then decode that string with the string-escape codec, which turns escape sequences such as \xc2\xbd back into the raw bytes. When you print the resulting byte string, the terminal interprets those bytes in its own encoding (usually UTF-8) and displays the character:

>>> print repr([s]).decode('string-escape')
['½']
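
If you would rather not round-trip through repr, another option is to build the display string yourself. The sketch below is not part of the answer above: format_list is a hypothetical helper, it assumes every item is already a string, and it does not escape quotes inside items, so it is meant for display only.

```python
# -*- coding: utf-8 -*-
# The coding line above is only needed when this is saved as a Python 2 file.

def format_list(items):
    """Hypothetical helper: render a list of strings without escaping them."""
    return "[" + ", ".join("'" + item + "'" for item in items) + "]"


s = '½'
print(format_list([s]))
```

Because the helper only concatenates the strings, it never calls repr on the items, so nothing gets escaped.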

Note that in Python 3.x, str is already Unicode and repr leaves printable non-ASCII characters unescaped, so you don't need this trick:

Python 3.4.3 (default, Oct 14 2015, 20:28:29) 
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 
>>> s='½'
>>> print ([s])
['½']
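
For comparison, a small Python 3 sketch of where the escaped forms come from. The variable names are only for illustration. The built-in ascii() escapes everything outside ASCII, much as Python 2's repr did, and encoding to UTF-8 gives the two bytes seen in the question.

```python
s = '½'

as_repr = repr(s)             # Python 3 keeps printable non-ASCII characters
as_ascii = ascii(s)           # escapes everything outside ASCII
as_bytes = s.encode('utf-8')  # the UTF-8 bytes behind the character

print(as_repr)
print(as_ascii)
print(as_bytes)
```

The last line prints b'\xc2\xbd', the same byte sequence that appears in the Python 2 output in the question.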

For more information about Python encodings, see https://docs.python.org/2.4/lib/standard-encodings.html

Mazdak answered Sep 18 '22 01:09