Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

32-bit unicode in python

Tags:

python

unicode

Python has an escape sequence \u to display unicode values. However this is restricted to only 16 bit unicode values. That is

>>> '\u1020'
'ဠ'

Whereas 32 bit uncode values do not work. That is

>>> '\u00001000'
'\x001000'

Which is obviously wrong. The python documentation mentions

The escape sequence \u0020 indicates to insert the Unicode character with the ordinal value 0x0020 (the space character) at the given position.

like image 228
Bhargav Rao Avatar asked Dec 30 '14 09:12

Bhargav Rao


1 Answers

The python How To Unicode clearly mentions the use of '\U' to represent 32-bit unicode sequences.

>>> "\u0394"                          # Using a 16-bit hex value
'Δ'
>>> "\U00000394"                      # Using a 32-bit hex value
'Δ'

In this case

>>> '\U00001000'
'က'
like image 72
Bhargav Rao Avatar answered Sep 23 '22 13:09

Bhargav Rao