While researching Unicode issues in Python 3, I came across this often-quoted document, which lays out the initial ideas behind Python 3 Unicode support. A quote from that page:
For historical reasons, the most widely used encoding is ascii, which can only handle Unicode code points in the range 0-0xEF (i.e. ASCII is a 7-bit encoding).
I understand that 0xEF = 14*16 + 15*1 = 239. This seems wrong to me, as binary 1111111 (7 bits) is 127. Is this quote wrong, or is my understanding wrong?
UPDATE: The document has been fixed! Thanks to Nick Coghlan for his excellent introduction to Python 3 string handling, and to bobince for his help in confirming the typo.
Yes, 0xEF appears to be a simple typo. The section makes perfect sense with that replaced by 0x7F.
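The arithmetic is easy to check in Python itself. The following is a minimal sketch, not part of the quoted document; the helper name is_ascii_encodable is made up for illustration. It confirms the two values from the question and then asks Python's ascii codec for the highest code point it will accept.

```python
# Values from the question: 0xEF is 239, while seven set bits give 127 (0x7F).
assert 0xEF == 14 * 16 + 15 * 1 == 239
assert 0b1111111 == 0x7F == 127


def is_ascii_encodable(code_point: int) -> bool:
    """Hypothetical helper: True if the ascii codec can encode this code point."""
    try:
        chr(code_point).encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


# Scan the first 256 code points and keep the highest one that ascii accepts.
highest = max(cp for cp in range(0x100) if is_ascii_encodable(cp))
print(hex(highest))
```

If the codec really is 7-bit, the printed value should be 0x7f, and a call such as is_ascii_encodable(0xEF) should return False.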