I am using Ubuntu 12.04 LTS. When I try something like this in the terminal:
rfx@digest:/usr/share/fonts/truetype/ttf-dejavu$ echo вдлжофыдвж
вдлжофыдвж
the symbols are shown correctly. But if I try to print Unicode symbols using Python 2.7, I get this:
>>> print u'абв'
ц│ц┌ц≈
Python reports that the terminal uses UTF-8 encoding by default:
>>> import sys
>>> sys.stdout.encoding
'UTF-8'
To print any character in the Python interpreter, use a \u escape followed by the character's four-digit hexadecimal code point. For instance, the code point for β is 03B2, so to print β the command is print(u'\u03B2'). In Python 2.7 the u prefix is required, because \u is not interpreted inside a plain byte string; in Python 3 the prefix is optional.
A given encoding may be able to represent only a limited subset of Unicode characters (ASCII, for example, covers only 128 of them). Encoding a character that the target encoding cannot represent fails and raises UnicodeEncodeError. To avoid this error, convert explicitly with encode('utf-8') and decode('utf-8') where your code crosses the boundary between text and bytes.
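As a minimal sketch of the failure described above, the snippet below encodes the Cyrillic letters from the question with two codecs. The helper name try_encode is made up for this illustration; the rest uses only built-in string methods and should behave the same on Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
# The three Cyrillic letters from the question, written as escapes so the
# source file itself stays pure ASCII.
text = u'\u0430\u0431\u0432'


def try_encode(value, encoding):
    """Hypothetical helper: return the encoded bytes, or None when the
    codec cannot represent every character in value."""
    try:
        return value.encode(encoding)
    except UnicodeEncodeError:
        return None


# ASCII has no Cyrillic letters, so this attempt fails and yields None.
ascii_bytes = try_encode(text, 'ascii')

# UTF-8 can represent every Unicode character; each of these letters
# takes two bytes.
utf8_bytes = try_encode(text, 'utf-8')

# Decoding with the same codec restores the original text.
round_trip = utf8_bytes.decode('utf-8')
```

The point of the sketch is that the error depends on the target encoding, not on the text: the same string that ASCII rejects encodes and decodes cleanly with UTF-8.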
Python 3's string type (str) uses the Unicode Standard for representing characters, which lets Python programs work with characters from many different writing systems. In Python 2, the separate unicode type plays that role.
In Python 2, if encoding and/or errors are given, unicode() will decode the object, which can be either an 8-bit string or a character buffer, using the codec for encoding. The encoding parameter is a string giving the name of an encoding; if the encoding is not known, LookupError is raised.
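The behaviour described above can be sketched as follows. Because unicode() exists only in Python 2, the example picks str on Python 3, where str(bytes, encoding, errors) performs the same decoding; the alias to_text and the codec name 'no-such-encoding' are placeholders chosen for this illustration.

```python
import sys

# unicode() is a Python 2 built-in. On Python 3 the equivalent call is
# str(bytes_object, encoding, errors). The conditional is evaluated
# lazily, so the name "unicode" is never looked up on Python 3.
to_text = unicode if sys.version_info[0] == 2 else str  # noqa: F821

raw = b'\xd0\xb0\xd0\xb1\xd0\xb2'  # UTF-8 bytes for three Cyrillic letters

# Decode the 8-bit string with an explicit codec.
decoded = to_text(raw, 'utf-8')

# An unknown encoding name raises LookupError.
try:
    to_text(raw, 'no-such-encoding')
    lookup_failed = False
except LookupError:
    lookup_failed = True

# The errors argument controls what happens to undecodable bytes:
# 'replace' substitutes U+FFFD instead of raising UnicodeDecodeError.
lenient = to_text(b'\xff\xd0\xb0', 'utf-8', 'replace')
```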
Your input is being improperly decoded by the terminal. This is not a Python problem.
To prove it, write the string with Unicode escapes so that nothing depends on how the terminal encodes what you type:
myunicode = u'\u0430\u0431\u0432'
print myunicode
print myunicode.encode('utf-8')
If this does not print the original string абв twice, then you need to configure your terminal emulator program correctly.
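If you want to test a guess about which encodings are being mixed up, one option is to reproduce the garbling in code. The sketch below is only a hypothesis, not something stated in the question: it assumes the terminal emulator is actually using KOI8-R while Python believes it is UTF-8, and that the typed bytes were read as Latin-1 along the way. The function name simulate_mojibake and its parameters are invented for this illustration. Under those assumptions the result appears to match the garbled output shown in the question.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

original = u'\u0430\u0431\u0432'  # the Cyrillic letters from the question


def simulate_mojibake(text, typed_as, misread_as, written_as, shown_as):
    """Hypothetical helper: follow text through a chain of mismatched
    encodings and return what would end up on screen."""
    typed_bytes = text.encode(typed_as)        # bytes the terminal sends
    misread = typed_bytes.decode(misread_as)   # how the program reads them
    output_bytes = misread.encode(written_as)  # bytes the program writes
    return output_bytes.decode(shown_as)       # how the terminal draws them


# Assumed chain: terminal really in KOI8-R, input read as Latin-1,
# output written as UTF-8, then displayed by the KOI8-R terminal.
garbled = simulate_mojibake(original, 'koi8-r', 'latin-1', 'utf-8', 'koi8-r')

# Print as ASCII-only escapes so this works in any terminal.
print(garbled.encode('unicode_escape').decode('ascii'))
```

If the escapes printed here correspond to the characters you see on screen, the mismatch is in the terminal emulator's character encoding setting rather than in Python.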