I'm trying to get the first char of a byte-string in Python 3.4, but when I index it, I get an int:
>>> my_bytes = b'just a byte string'
>>> my_bytes
b'just a byte string'
>>> my_bytes[0]
106
>>> type(my_bytes[0])
<class 'int'>
This seems unintuitive to me, as I was expecting to get b'j'.
I have discovered that I can get the value I expect, but it feels like a hack to me.
>>> my_bytes[0:1]
b'j'
Can someone please explain why this happens?
The bytes() function returns a bytes object. It can convert other objects into bytes objects, or create a zero-filled bytes object of the specified size. The difference between bytes() and bytearray() is that bytes() returns an object that cannot be modified, while bytearray() returns an object that can be modified.
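As a rough illustration of the constructor forms and the immutability described above, here is a minimal sketch. The variable names are made up for the example; only the built-in bytes() is used.

```python
# Three common ways to build a bytes object.
zero_filled = bytes(3)                    # three zero bytes, not an empty object
from_ints = bytes([106, 117, 115, 116])   # from an iterable of ints in range(256)
from_text = bytes("just", "ascii")        # from a str plus an explicit encoding

print(zero_filled)   # b'\x00\x00\x00'
print(from_ints)     # b'just'
print(from_text)     # b'just'

# bytes is immutable: item assignment raises TypeError.
try:
    from_ints[0] = 74
    assignment_failed = False
except TypeError:
    assignment_failed = True

print(assignment_failed)  # True
```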
Python | bytearray() function: The bytearray() method returns a bytearray object, which is an array of the given bytes. It gives a mutable sequence of integers in the range 0 <= x < 256. Returns: an array of bytes of the given size. The source parameter can be used to initialize the array in a few different ways.
bytes is an immutable version of bytearray: it has the same non-mutating methods and the same indexing and slicing behavior. The bytearray() function returns a new array of bytes. The bytearray type is a mutable sequence of integers in the range 0 <= x < 256.
The bytearray type is a mutable sequence of integers in the range 0 to 255. It allows you to work directly with binary data, such as the contents of image files or data arriving from the network. bytearray shares the mutable sequence operations of list and most of the string-like methods of bytes.
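For contrast with bytes, a short sketch of in-place modification on a bytearray might look like this. The name buffer is just for illustration. Note that indexing still yields an int, and slicing yields another bytearray.

```python
buffer = bytearray(b"just")

buffer[0] = ord("J")     # item assignment takes an int in range(256)
buffer.append(33)        # list-like method; 33 is ASCII '!'
buffer.extend(b" ok")    # extend accepts any iterable of ints, including bytes

print(buffer)            # bytearray(b'Just! ok')
print(buffer[0])         # 74
print(buffer[0:1])       # bytearray(b'J')
print(buffer.upper())    # bytearray(b'JUST! OK')
```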
The bytes type is a Binary Sequence type, and is explicitly documented as containing a sequence of integers in the range 0 to 255.
From the documentation:
Bytes objects are immutable sequences of single bytes.
[...]
While bytes literals and representations are based on ASCII text, bytes objects actually behave like immutable sequences of integers, with each value in the sequence restricted such that 0 <= x < 256 [...]
[...]
Since bytes objects are sequences of integers (akin to a tuple), for a bytes object b, b[0] will be an integer, while b[0:1] will be a bytes object of length 1. (This contrasts with text strings, where both indexing and slicing will produce a string of length 1).
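To make the indexing versus slicing distinction concrete, here is a small sketch of the usual ways to get at the first byte. The variable names are only illustrative.

```python
my_bytes = b"just a byte string"

first_as_int = my_bytes[0]         # indexing gives the integer value
first_as_bytes = my_bytes[0:1]     # slicing gives a bytes object of length 1
rebuilt = bytes([my_bytes[0]])     # wrap the int in a list to get bytes back
as_text = chr(my_bytes[0])         # or turn the int into a one-character str

# Pitfall: bytes(106) would create 106 zero bytes, not b'j'.
zeros = bytes(my_bytes[0])

# Iterating over bytes also yields ints.
values = list(my_bytes[:4])

# Slicing never raises on an empty object, unlike indexing.
empty_slice = b""[0:1]

print(first_as_int)     # 106
print(first_as_bytes)   # b'j'
print(rebuilt)          # b'j'
print(as_text)          # j
print(len(zeros))       # 106
print(values)           # [106, 117, 115, 116]
print(empty_slice)      # b''
```

So the slice is not really a hack: it is the documented way to ask for a length-1 bytes object, and it has the side benefit of not raising IndexError on empty input.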
Bold emphasis mine. Note that indexing a string is a bit of an exception among the sequence types; 'abc'[0] gives you a str object of length one; str is the only sequence type that contains elements of its own type, always.
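A quick way to see that str is the odd one out is to compare the type of each container with the type of its first element. This is only a sketch, and the sample values are arbitrary.

```python
text = "abc"

# Indexing a str gives a str, however many times you repeat it.
print(type(text[0]) is str)      # True
print(text[0][0][0] == "a")      # True

# For other sequence types, the element type differs from the container type.
element_types = {}
for sequence in (["a", "b"], ("a", "b"), b"ab", range(2)):
    element_types[type(sequence).__name__] = type(sequence[0]).__name__

print(element_types)
# {'list': 'str', 'tuple': 'str', 'bytes': 'int', 'range': 'int'}
```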
This echoes how other languages treat string data; in C the unsigned char type is also effectively an integer in the range 0-255. Whether an unqualified char type is signed or unsigned is implementation-defined, so some C compilers default to unsigned and others to signed, and text is modelled as a char[] array.