
What's the point of chr(128) .. chr(255) in Python?

Tags: python, ascii

Edit: I'm talking about behavior in Python 2.7.

The chr function converts integers between 0 and 127 into the corresponding ASCII characters, e.g.

>>> chr(65)
'A'

I get how this is useful in certain situations and I understand why it covers 0..127, the 7-bit ASCII range.

The function also accepts arguments from 128..255. For these numbers, it seems to simply return a hexadecimal escape of the argument (e.g. '\xe4' for 228). In this range, a byte means different things depending on which part of the ISO-8859 standard is used.

I'd understand if chr took another argument, e.g.

>>> chr(228, encoding='iso-8859-1') # hypothetical
'ä'

However, there is no such option:

chr(i) -> character

Return a string of one character with ordinal i; 0 <= i < 256.

My question is: What is the point of raising ValueError for i > 255 instead of i > 127? As far as I can tell, all the function does for 128 <= i < 256 is return hex values.

asked Nov 19 '14 23:11 by malana



1 Answer

In Python 2.x, a str is a sequence of bytes, so chr() returns a string of one byte and accepts values in the range 0-255, which is exactly the range a single byte can represent. When you print the repr() of a string containing a byte in the range 128-255, that byte is shown in escape format (for example, '\xe4' for chr(228)) because there is no standard way to display it: ASCII defines only 0-127. The value returned is still a one-byte string, not a hex string. You can, however, convert it to Unicode with unicode() by specifying the source encoding:

unicode(chr(200), encoding="latin1")
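To illustrate why the source encoding has to be supplied, here is a minimal sketch that decodes one and the same byte under three parts of ISO-8859. The choice of byte 228 and of these three encodings is only for illustration, and the snippet uses a bytes literal with .decode() rather than unicode() so that it should behave the same on Python 2.7 and Python 3.

```python
# One byte with value 228 (0xE4). A b"" literal is a plain str on
# Python 2.7 and a bytes object on Python 3.
raw = b"\xe4"

# Illustrative selection of ISO-8859 parts (Latin-1, Cyrillic, Greek).
encodings = ("iso-8859-1", "iso-8859-5", "iso-8859-7")

# Decode the same byte under each encoding.
decoded = {name: raw.decode(name) for name in encodings}

for name in encodings:
    # %r shows the result unambiguously regardless of terminal encoding.
    print("%s -> %r" % (name, decoded[name]))
```

The byte itself never changes; only its interpretation does, which is why chr() alone cannot know which character a value above 127 is supposed to stand for.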

In Python 3.x, str is a sequence of Unicode characters and chr() takes a much larger range. Bytes are handled by the bytes type.
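As a rough sketch of the Python 3 behavior described above (Python 3 only; the variable names are just for illustration), chr() works on Unicode code points while single bytes are built and decoded through the bytes type:

```python
# Python 3: chr() maps a Unicode code point to a one-character str.
letter = chr(228)       # U+00E4, LATIN SMALL LETTER A WITH DIAERESIS
euro = chr(0x20AC)      # U+20AC, EURO SIGN; far beyond 255

# Raw bytes live in the separate bytes type.
single_byte = bytes([228])               # one byte, value 0xE4
as_text = single_byte.decode("latin-1")  # interpret it as ISO-8859-1

# ValueError is still raised, but only past the last code point.
try:
    chr(0x110000)
except ValueError:
    out_of_range = True
else:
    out_of_range = False

print(repr(letter), repr(euro), single_byte, repr(as_text), out_of_range)
```

Here the decoded byte and chr(228) compare equal only because ISO-8859-1 happens to map each byte value to the code point with the same number.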

answered Oct 10 '22 01:10 by kindall