
Unicode strings syntax in Python

The official Python tutorial states that Unicode strings in Python can be written like this:

u'Hello World !'

But when I enter it in IDLE (the Python GUI) under Python 3.2, I get a syntax error. Russian and Chinese text can also be stored successfully in plain Python strings, so I guess they are already Unicode.

Could you please explain what's happening?

Sergey asked Dec 28 '22 08:12


1 Answer

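By default, Python 3.2 strings are Unicode, so the u prefix is no longer needed. In Python 3.0 through 3.2 the prefix is rejected as a syntax error, which is what you are seeing in IDLE.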

If you want to encode and decode strings you should use:

encoded = "unicodestring".encode("UTF8")

decoded = encoded.decode("UTF8")
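As a rough illustration of the round trip described above, here is a minimal sketch for Python 3. The variable names and the sample text are placeholders chosen for this example, not anything from the tutorial:

```python
# In Python 3 an ordinary string literal is already Unicode text (type str),
# so no u prefix is needed. The sample text here is illustrative only.
text = "Привет, 世界"

# Encoding converts text (str) into binary data (bytes).
encoded = text.encode("utf-8")

# Decoding converts the bytes back into text.
decoded = encoded.decode("utf-8")

print(type(text).__name__)     # name of the text type
print(type(encoded).__name__)  # name of the binary type
print(decoded == text)         # the round trip should preserve the text
```

Note that decode is called on the bytes object returned by encode; a str has no decode method in Python 3.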

The Python documentation states that:

Python 3.0 uses the concepts of text and (binary) data instead of Unicode strings and 8-bit strings. All text is Unicode; however encoded Unicode is represented as binary data. The type used to hold text is str

and

You can no longer use u"..." literals for Unicode text. However, you must use b"..." literals for binary data.
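To make the text versus binary data distinction a little more concrete, the sketch below compares a plain literal with a b"..." literal in Python 3. The names text, data, and mixing_error are made up for this example:

```python
# A plain literal is text (str); a b"..." literal is binary data (bytes).
text = "abc"
data = b"abc"

# The two types never compare equal, even with the same ASCII content.
print(text == data)

# Mixing them in one expression raises TypeError instead of converting silently.
mixing_error = None
try:
    text + data
except TypeError as exc:
    mixing_error = exc
print(type(mixing_error).__name__)

# An explicit decode is what turns the bytes into comparable text.
print(data.decode("ascii") == text)
```

The design point is that conversion between the two types is always explicit, through encode and decode.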

Serdalis answered Dec 30 '22 11:12