Why can't I display a unicode character in the Python Interpreter on Mac OS X Terminal.app?

If I try to paste a unicode character such as the middle dot:

·

in my Python interpreter, it does nothing. I'm using Terminal.app on Mac OS X, and when I'm simply in bash I have no trouble:

:~$ ·

But in the interpreter:

:~$ python
Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29) 
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 

^^ I get nothing; it just ignores the character I pasted. If I use the escaped byte representation of the middle dot, '\xc2\xb7', and try to convert it to unicode, the interpreter throws an error:

>>> unicode('\xc2\xb7')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 0: ordinal not in range(128)

I have set up 'utf-8' as my default encoding in sitecustomize.py, so:

>>> sys.getdefaultencoding()
'utf-8'

What gives? It's not the Terminal. It's not Python, what am I doing wrong?!

This question is not related to this question, as that individual is able to paste unicode into his Terminal.

Bjorn asked Apr 27 '10 03:04

1 Answer

unicode('\xc2\xb7') means to decode the byte string in question with the default codec, which is ascii -- and that of course fails. (Trying to set a different default encoding has never worked well, and in particular doesn't apply to "pasted literals" -- that would require a different setting anyway.) You could instead use u'\xc2\xb7', and see:

>>> print(u'\xc2\xb7')
Â·

since those are two unicode characters, of course (U+00C2 and U+00B7). While:

>>> print(u'\uc2b7')
슷

gives you a single unicode character (a Hangul syllable, U+C2B7). BTW, neither of these is the "middle dot" you were looking for. Maybe you mean

>>> print('\xc2\xb7'.decode('utf8'))
·

which is the middle dot. BTW, for me (python 2.6.4 from python.org on a Mac Terminal.app):

>>> print('슷')
슷

which kind of surprised me (I expected an error...!-).
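
For anyone following along on Python 3, where byte strings and text are separate types, the three spellings discussed above can be compared side by side. This is only a sketch under that assumption (the sessions above are Python 2.6), and the variable names are mine, not from the original post:

```python
# Python 3 sketch: bytes and text are distinct types here,
# unlike Python 2's str/unicode pair used in the sessions above.
import unicodedata

raw = b'\xc2\xb7'  # the two UTF-8 bytes from the question

# Decoding those bytes as ASCII fails, like the traceback in the question.
try:
    raw.decode('ascii')
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = str(exc)

middle_dot = raw.decode('utf-8')  # one code point: U+00B7
two_chars = '\xc2\xb7'            # two code points: U+00C2 and U+00B7
hangul = '\uc2b7'                 # one code point: U+C2B7

# Print only the code point names, so the output stays ASCII
# whatever the terminal encoding happens to be.
for label, text in [('utf-8 decode', middle_dot),
                    ('\\x escapes', two_chars),
                    ('\\u escape', hangul)]:
    print(label, len(text), [unicodedata.name(ch) for ch in text])
```

Only the explicit UTF-8 decode yields the single MIDDLE DOT character; the \x escapes in a text literal name two separate code points, and the \u escape names an unrelated one.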

Alex Martelli answered Sep 27 '22 17:09