
Is it possible to know the maximum number accepted by chr using Python?

According to Python's documentation for the built-in chr function, the maximum value that chr accepts is 1114111 (decimal), or 0x10FFFF (hexadecimal). And indeed:

>>> chr(1114112)
Traceback (most recent call last):
  File "<pyshell#20>", line 1, in <module>
    chr(1114112)
ValueError: chr() arg not in range(0x110000)

My first question is: why exactly that number? My second question is: if this number ever changes, is there a Python command that reports the maximum value accepted by chr?

nbro asked Jan 15 '16 14:01



1 Answer

Use sys.maxunicode:

An integer giving the value of the largest Unicode code point, i.e. 1114111 (0x10FFFF in hexadecimal).

On my Python 2.7 UCS-2 (narrow) build, the largest code point supported by unichr() is 0xFFFF:

>>> import sys
>>> sys.maxunicode
65535

but Python 3.3 and newer switched to a new internal storage format for Unicode strings, and the maximum is now always 0x10FFFF. See PEP 393.
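
As a minimal sketch of how to check this from code, assuming Python 3.3 or newer (the variable names here are only illustrative), you can read sys.maxunicode and confirm that chr accepts it but rejects the next integer:

```python
import sys

# Largest code point this interpreter supports; chr() accepts 0..max_cp.
max_cp = sys.maxunicode

# The boundary value itself is accepted...
last_char = chr(max_cp)

# ...while one past it raises ValueError, as in the question's traceback.
try:
    chr(max_cp + 1)
    overflow_rejected = False
except ValueError:
    overflow_rejected = True

print(hex(max_cp), overflow_rejected)
```

Reading the limit from sys.maxunicode instead of hard-coding 0x10FFFF keeps the check correct even if the limit ever differs on some interpreter.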

0x10FFFF is the maximum Unicode codepoint as defined in the Unicode standard. Quoting the Wikipedia article on Unicode:

Unicode defines a codespace of 1,114,112 code points in the range 0 to 10FFFF.
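
To make the arithmetic concrete: the Unicode codespace is organized as 17 planes of 0x10000 (65,536) code points each, which gives the 1,114,112 total quoted above and is why the error message mentions range(0x110000). A small sketch, with constant names chosen only for illustration:

```python
# The Unicode codespace: 17 planes, each holding 0x10000 code points.
PLANES = 17
CODE_POINTS_PER_PLANE = 0x10000

codespace_size = PLANES * CODE_POINTS_PER_PLANE  # 0x110000
highest_code_point = codespace_size - 1          # 0x10FFFF

print(codespace_size, hex(highest_code_point))  # 1114112 0x10ffff
```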

Martijn Pieters answered Sep 21 '22 11:09
