According to the Python documentation for the chr built-in function, the maximum value that chr accepts is 1114111 in decimal, or 0x10FFFF in hexadecimal. Indeed:
>>> chr(1114112)
Traceback (most recent call last):
  File "<pyshell#20>", line 1, in <module>
    chr(1114112)
ValueError: chr() arg not in range(0x110000)
My first question is: why exactly that number? My second question is: if this number changes, is it possible to find out from a Python command the maximum value accepted by chr?
Use sys.maxunicode:
An integer giving the value of the largest Unicode code point, i.e. 1114111 (0x10FFFF in hexadecimal).
On my Python 2.7 UCS-2 build, the largest code point supported by unichr() is 0xFFFF:
>>> import sys
>>> sys.maxunicode
65535
However, Python 3.3 and newer switched to a new internal storage format for Unicode strings, and the maximum is now always 0x10FFFF. See PEP 393.
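As a rough sketch of how the limit could be checked at runtime rather than hard-coded, the snippet below reads sys.maxunicode and probes chr on both sides of it. It assumes Python 3.3 or newer; the helper name chr_accepts is made up for this illustration and is not part of the standard library.

```python
import sys

# Largest code point this interpreter reports (0x10FFFF on Python 3.3+).
max_cp = sys.maxunicode


def chr_accepts(n):
    """Hypothetical helper: True if chr(n) succeeds, False if it is rejected."""
    try:
        chr(n)
    except (ValueError, OverflowError):
        # ValueError: outside range(0x110000).
        # OverflowError: integer too large to convert to a C int.
        return False
    return True


print(hex(max_cp))
print(chr_accepts(max_cp))      # the upper bound itself is accepted
print(chr_accepts(max_cp + 1))  # one past the bound is rejected
```

Reading the bound from sys.maxunicode keeps the check correct on older narrow builds as well, where the reported value is 0xFFFF instead.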
0x10FFFF is the maximum Unicode code point as defined in the Unicode standard. Quoting the Wikipedia article on Unicode:
Unicode defines a codespace of 1,114,112 code points in the range 0 to 10FFFF.
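As for why the count is exactly 1,114,112: the codespace is organized as 17 planes of 65,536 code points each, and 0x10FFFF is also the highest value that a UTF-16 surrogate pair can encode. The arithmetic can be sketched as follows; the variable names are chosen for this illustration only.

```python
# 17 planes (0 through 16), each holding 0x10000 = 65,536 code points.
PLANES = 17
PLANE_SIZE = 0x10000
codespace_size = PLANES * PLANE_SIZE

# Highest code point reachable with a UTF-16 surrogate pair:
# high surrogates span 0xD800..0xDBFF, low surrogates span 0xDC00..0xDFFF,
# and each contributes 10 bits on top of the 0x10000 offset.
high_bits = 0xDBFF - 0xD800
low_bits = 0xDFFF - 0xDC00
highest_via_surrogates = 0x10000 + (high_bits << 10) + low_bits

print(codespace_size)
print(hex(highest_via_surrogates))
```

Both calculations land on the same boundary: range(0x110000) in the chr error message covers exactly those 17 planes.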