Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why do I get an int when I index bytes?

I'm trying to get the first char of a byte-string in python 3.4, but when I index it, I get an int:

>>> my_bytes = b'just a byte string' b'just a byte string' >>> my_bytes[0] 106 >>> type(my_bytes[0]) <class 'int'> 

This seems unintuitive to me, as I was expecting to get b'j'.

I have discovered that I can get the value I expect, but it feels like a hack to me.

>>> my_bytes[0:1] b'j' 

Can someone please explain why this happens?

like image 845
meshy Avatar asked Jan 31 '15 08:01

meshy


People also ask

How to define bytes in PYTHON?

The bytes() function returns a bytes object. It can convert objects into bytes objects, or create empty bytes object of the specified size. The difference between bytes() and bytearray() is that bytes() returns an object that cannot be modified, and bytearray() returns an object that can be modified.

How to define byte array in PYTHON?

Python | bytearray() functionbytearray() method returns a bytearray object which is an array of given bytes. It gives a mutable sequence of integers in the range 0 <= x < 256. Returns: Returns an array of bytes of the given size. source parameter can be used to initialize the array in few different ways.

What is the difference between bytes and bytearray?

bytes is an immutable version of bytearray – it has the same non-mutating methods and the same indexing and slicing behavior. bytearray() function : Return a new array of bytes. The bytearray type is a mutable sequence of integers in the range 0 <= x < 256.

What is byte array data type in PYTHON?

The bytearray type is a mutable sequence of integers in the range between 0 and 255. It allows you to work directly with binary data. It can be used to work with low-level data such as that inside of images or arriving directly from the network. Bytearray type inherits methods from both list and str types.


1 Answers

The bytes type is a Binary Sequence type, and is explicitly documented as containing a sequence of integers in the range 0 to 255.

From the documentation:

Bytes objects are immutable sequences of single bytes.

[...]

While bytes literals and representations are based on ASCII text, bytes objects actually behave like immutable sequences of integers, with each value in the sequence restricted such that 0 <= x < 256[.]

[...]

Since bytes objects are sequences of integers (akin to a tuple), for a bytes object b, b[0] will be an integer, while b[0:1] will be a bytes object of length 1. (This contrasts with text strings, where both indexing and slicing will produce a string of length 1).

Bold emphasis mine. Note than indexing a string is a bit of an exception among the sequence types; 'abc'[0] gives you a str object of length one; str is the only sequence type that contains elements of its own type, always.

This echoes how other languages treat string data; in C the unsigned char type is also effectively an integer in the range 0-255. Many C compilers default to unsigned if you use an unqualified char type, and text is modelled as a char[] array.

like image 172
Martijn Pieters Avatar answered Oct 04 '22 07:10

Martijn Pieters