Edit: I'm talking about behavior in Python 2.7.
The chr function converts integers between 0 and 127 into the corresponding ASCII characters. E.g.
>>> chr(65)
'A'
I get how this is useful in certain situations and I understand why it covers 0..127, the 7-bit ASCII range.
The function also takes arguments from 128..255. For these numbers, it simply returns the hexadecimal representation of the argument. In this range, different bytes mean different things depending on which part of the ISO-8859 standard is used.
I'd understand if chr took another argument, e.g.
>>> chr(228, encoding='iso-8859-1') # hypothetical
'ä'
However, there is no such option:
chr(i) -> character
Return a string of one character with ordinal i; 0 <= i < 256.
My question is: What is the point of raising ValueError for i > 255 instead of i > 127? All the function does for 128 <= i < 256 is return hex values.
The limit occurs due to an optimization technique where smaller strings are stored with the first byte holding the length of the string. Since a byte can only hold 256 different values, the maximum string length would be 255 since the first byte was reserved for storing the length.
Python's chr() function takes an integer argument and returns the string representing the character at that code point. Since chr() converts an integer to a character, there is a valid range for the input.
For example, chr(65) returns the string 'A', while chr(126) returns the string '~'.
The Python ord() function converts a character into the integer that represents its Unicode code point. Conversely, the chr() function converts a Unicode code point into the corresponding one-character string.
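As a small illustration of the inverse relationship described above, the sketch below round-trips the printable ASCII range through chr() and ord(). It stays within 0..127, so it should behave the same on Python 2.7 and Python 3; the variable names are just for this example.

```python
# ord() maps a character to its integer code, chr() maps it back.
code = ord("A")
letter = chr(code)
print(code, letter)

# Round trip over the printable ASCII range (32..126).
printable = [chr(i) for i in range(32, 127)]
codes = [ord(c) for c in printable]

# Every code survives the chr() -> ord() round trip unchanged.
round_trip_ok = codes == list(range(32, 127))
print(round_trip_ok)
```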
In Python 2.x, a str is a sequence of bytes, so chr() returns a string of one byte and accepts values in the range 0-255, as this is the range that can be represented by a byte. When you print the repr() of a string with a byte in the range 128-255, the character is printed in escape format because there is no standard way to represent such characters (ASCII defines only 0-127). However, you can convert it to Unicode using unicode() and specify the source encoding:
unicode(chr(200), encoding="latin1")
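To make the point about encodings concrete, here is a minimal sketch that decodes the same single byte (200, i.e. 0xC8) under three parts of ISO-8859. It is written so that it should run on both Python 2.7 and Python 3, which is why it builds the byte via bytearray rather than calling chr() or unicode(); the variable names are only illustrative.

```python
# One raw byte with value 200 (0xC8).
# bytes(bytearray([...])) yields a one-byte str on Python 2 and bytes on Python 3.
raw = bytes(bytearray([200]))

# The byte itself carries no character meaning; the encoding supplies it.
as_latin1 = raw.decode("iso-8859-1")    # Western European
as_cyrillic = raw.decode("iso-8859-5")  # Cyrillic
as_greek = raw.decode("iso-8859-7")     # Greek

print(repr(as_latin1))
print(repr(as_cyrillic))
print(repr(as_greek))

# Encoding back with the same codec recovers the original byte.
restored = as_latin1.encode("iso-8859-1")
```

The three decoded values are different characters, which is why the function cannot pick one on your behalf: it returns the byte and leaves the interpretation to an explicit decode step.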
In Python 3.x, str is a sequence of Unicode characters and chr() takes a much larger range (any code point from 0 through 0x10FFFF). Bytes are handled by the bytes type.
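The Python 3 behaviour can be checked with a short sketch like the one below. It assumes Python 3 only; chr_accepts is a hypothetical helper written for this example, not part of the standard library.

```python
def chr_accepts(i):
    """Return True if chr(i) succeeds, False if it raises ValueError."""
    try:
        chr(i)
        return True
    except ValueError:
        return False

# In Python 3, chr() returns a Unicode character, so 228 is simply U+00E4.
char = chr(228)
print(char == "\u00e4")

# The valid range ends at the last Unicode code point, 0x10FFFF.
print(chr_accepts(0x10FFFF))
print(chr_accepts(0x110000))

# Raw bytes live in the separate bytes type, whose items are limited to 0..255.
one_byte = bytes([228])
decoded = one_byte.decode("latin-1")
print(one_byte, decoded == char)
```

Under Latin-1, byte values 0..255 map directly onto the first 256 Unicode code points, which is why decoding the byte 228 gives the same character as chr(228).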