
How can I make a numpy ndarray from bytes?




I can convert a numpy ndarray to bytes using myndarray.tobytes(). Now how can I get it back to an ndarray?

Using the example from the .tobytes() method docs:

>>> x = np.array([[0, 1], [2, 3]])
>>> bytes = x.tobytes()
>>> bytes

>>> np.some_magic_function_here(bytes)
array([[0, 1], [2, 3]])
Jonathan asked Dec 04 '17 16:12


3 Answers

To deserialize the bytes you need np.frombuffer().
tobytes() serializes the array into bytes and np.frombuffer() deserializes them.

Bear in mind that once the array is serialized, its shape information is lost (and the dtype has to be supplied again), so after deserialization you must reshape the result back to its original shape.

Below is a complete example:

import numpy as np

x = np.array([[0, 1], [2, 3]], np.int8)
raw = x.tobytes()
# raw is a flat byte string, which means it contains no info regarding the shape of x
# let's make sure: we have 4 values with dtype int8 (one byte per item), therefore the length of raw should be 4 bytes
assert len(raw) == 4, "Ha??? Weird machine..."

# the dtype passed here must match the dtype of the serialized array
deserialized_bytes = np.frombuffer(raw, dtype=np.int8)
deserialized_x = np.reshape(deserialized_bytes, (2, 2))
assert np.array_equal(x, deserialized_x), "Deserialization failed..."
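Since the raw bytes carry neither shape nor dtype, one option is to pass that metadata along with the buffer. The following is only a sketch of that idea; the helper names pack_array and unpack_array are hypothetical and not part of NumPy:

```python
import numpy as np

def pack_array(arr):
    """Return the raw bytes plus the metadata needed to rebuild the array."""
    return arr.tobytes(), arr.dtype.str, arr.shape

def unpack_array(raw, dtype_str, shape):
    """Rebuild an array from raw bytes; the result is a view onto raw."""
    return np.frombuffer(raw, dtype=np.dtype(dtype_str)).reshape(shape)

x = np.array([[0, 1], [2, 3]], dtype=np.int8)
raw, dtype_str, shape = pack_array(x)
restored = unpack_array(raw, dtype_str, shape)

assert np.array_equal(x, restored)
assert restored.dtype == x.dtype

# frombuffer on an immutable bytes object yields a non-writable array;
# take a copy if the data has to be modified.
writable = restored.copy()
writable[0, 0] = 5
```

The array returned by np.frombuffer() shares memory with the bytes object, which is why it is read-only here.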
Daniel answered Oct 21 '22 08:10


After your edit it seems you are going in the wrong direction!

You can't use tobytes() to store a complete array, including information such as shape and dtype, if the array has to be reconstructed from those bytes alone. It only saves the raw data (the cell values), flattened in C or Fortran order.
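A small sketch of what gets lost (the array names here are just for illustration): two arrays with different shapes produce identical bytes, and the flattening order is picked when tobytes() is called.

```python
import numpy as np

a = np.arange(6, dtype=np.int16).reshape(2, 3)
b = np.arange(6, dtype=np.int16).reshape(3, 2)

# Different shapes, identical raw bytes: the shape is not encoded.
assert a.tobytes() == b.tobytes()

# 'C' (row-major) is the default order; 'F' flattens column-major instead.
c_bytes = a.tobytes(order="C")
f_bytes = a.tobytes(order="F")
assert c_bytes != f_bytes

# The dtype is not encoded either: reading the same bytes as int8
# gives twice as many items instead of raising an error.
as_int8 = np.frombuffer(c_bytes, dtype=np.int8)
assert as_int8.size == 12
```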

We don't know your task, but you will need a real serialization format. There are tons of approaches; the easiest is the following, based on Python's pickle (the example is Python 3):

import pickle
import numpy as np

x = np.array([[0, 1], [2, 3]])
print(x)

x_as_bytes = pickle.dumps(x)
print(x_as_bytes)
print(type(x_as_bytes))

y = pickle.loads(x_as_bytes)
print(y)

Output:

[[0 1]
 [2 3]]
 b'\x80\x03cnumpy.core.multiarray\n_reconstruct\nq\x00cnumpy\nndarray\nq\x01K\x00\x85q\x02C\x01bq\x03\x87q\x04Rq\x05(K\x01K\x02K\x02\x86q\x06cnumpy\ndtype\nq\x07X\x02\x00\x00\x00i8q\x08K\x00K\x01\x87q\tRq\n(K\x03X\x01\x00\x00\x00<q\x0bNNNJ\xff\xff\xff\xffJ\xff\xff\xff\xffK\x00tq\x0cb\x89C \x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x03\x00\x00\x00\x00\x00\x00\x00q\rtq\x0eb.'
<class 'bytes'>
[[0 1]
 [2 3]]

A better alternative would be joblib, which provides specialized pickling for large arrays. joblib's functions are file-object based, but they can also be used in memory with byte strings via Python's io.BytesIO.
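The same in-memory idea also works with NumPy alone. Note that this sketch swaps joblib for np.save() and np.load(), so it is a variation on the suggestion above rather than the joblib approach itself. The .npy format written by np.save() records shape and dtype next to the data:

```python
import io
import numpy as np

x = np.array([[0, 1], [2, 3]])

# Write the array into an in-memory file object instead of a file on disk.
buffer = io.BytesIO()
np.save(buffer, x)  # .npy header (dtype, shape, order) followed by the data
x_as_bytes = buffer.getvalue()

# Reading it back needs no extra shape or dtype arguments.
y = np.load(io.BytesIO(x_as_bytes))
assert np.array_equal(x, y)
assert y.shape == x.shape and y.dtype == x.dtype
```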

sascha answered Oct 21 '22 07:10


If you know the dimensions of the array you are recreating ahead of time, use numpy.ndarray(<dimensions>, <dataType>, <bytes (aka buffer)>):

import numpy

x = numpy.array([[1.0,1.1,1.2,1.3],[2.0,2.1,2.2,2.3],[3.0,3.1,3.2,3.3]],numpy.float64)
#array([[1. , 1.1, 1.2, 1.3],
#       [2. , 2.1, 2.2, 2.3],
#       [3. , 3.1, 3.2, 3.3]])

xBytes = x.tobytes()

newX = numpy.ndarray((3,4),numpy.float64,xBytes)
#array([[1. , 1.1, 1.2, 1.3],
#       [2. , 2.1, 2.2, 2.3],
#       [3. , 3.1, 3.2, 3.3]])

Another approach applies if you have stored your data as records of bytes rather than as an entire ndarray, and your selection of data varies from ndarray to ndarray: aggregate the pre-array data as bytes in a Python bytearray. Once it has reached the desired size, you know the required dimensions and can supply those dimensions and the data type together with the bytearray as the buffer.
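A rough sketch of that bytearray idea follows. The record layout (three little-endian float64 values per record, built with struct) is an assumption made up for this example:

```python
import struct
import numpy as np

# Hypothetical incoming records: each one is three float64 values packed as bytes.
records = [struct.pack("<3d", float(i), i + 0.5, i + 1.0) for i in range(4)]

# Accumulate the raw records in a growable bytearray.
buffer = bytearray()
for record in records:
    buffer += record

# Once all records are collected, the dimensions follow from the byte count.
n_cols = 3
itemsize = np.dtype("<f8").itemsize
n_rows = len(buffer) // (n_cols * itemsize)

# Build the array directly on top of the bytearray (no copy is made).
arr = np.ndarray((n_rows, n_cols), dtype="<f8", buffer=buffer)
```

Because the array shares memory with the bytearray, it is writable, and changes made through arr show up in buffer as well.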

Old Winterton answered Oct 21 '22 09:10
