
RawArray from numpy array?

I want to share a numpy array across multiple processes. The processes only read the data, so I want to avoid making copies. I know how to do it if I can start with a multiprocessing.sharedctypes.RawArray and then create a numpy array using numpy.frombuffer. But what if I am initially given a numpy array? Is there a way to initialize a RawArray with the numpy array's data without copying the data? Or is there another way to share the data across the processes without copying it?

christianbrodbeck asked Oct 10 '14 15:10



1 Answer

To my knowledge, it is not possible to declare memory as shared after it has already been allocated by a specific process. Similar discussions can be found here and here (more suitable).

Let me quickly sketch the workaround you mentioned (starting with a RawArray and getting a numpy.ndarray reference to it).

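import ctypes
import numpy as np
from multiprocessing.sharedctypes import RawArray

# option 1: allocate a shared buffer with an explicit ctypes type and length
raw_arr = RawArray(ctypes.c_int, 12)
# option 2: set it up to match some existing np.ndarray np_arr2
np_arr2 = np.arange(12, dtype=np.int32)  # example array, only for illustration
raw_arr = RawArray(
        np.ctypeslib.as_ctypes_type(np_arr2.dtype), np_arr2.size
        )
# wrap the shared buffer in a numpy array without copying it
np_arr = np.frombuffer(raw_arr, dtype=raw_arr._type_)
# np_arr: numpy array backed by shared memory, can be used with multiprocessing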
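
As a quick check that the array returned by np.frombuffer is a view on the shared buffer rather than a copy, here is a minimal sketch. The shape (3, 4) and the variable names are assumptions chosen for illustration only. Note that np.frombuffer always returns a 1-D array, so a reshape is needed to get a multi-dimensional view.

```python
import ctypes
import numpy as np
from multiprocessing.sharedctypes import RawArray

shape = (3, 4)  # hypothetical shape, chosen only for this example
raw_arr = RawArray(ctypes.c_double, shape[0] * shape[1])

# frombuffer yields a 1-D view; reshape returns another view, still no copy
np_arr = np.frombuffer(raw_arr, dtype=np.float64).reshape(shape)

# write through the numpy view ...
np_arr[1, 2] = 7.5
# ... and read the same element back directly from the raw buffer
flat_index = 1 * shape[1] + 2
value_in_buffer = raw_arr[flat_index]
writes_are_shared = (value_in_buffer == 7.5)
```

If writes_are_shared is True, both objects refer to the same memory, which is what allows worker processes to read the data without a per-process copy.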

If you have to start with a numpy.ndarray, you have no choice but to copy the data:

import numpy as np
from multiprocessing.sharedctypes import RawArray

np_arr = np.zeros(shape=(3, 4), dtype=np.ubyte)
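# option 1: reinterpret np_arr as a (nested) ctypes array, then copy it into a RawArray
tmp = np.ctypeslib.as_ctypes(np_arr)
raw_arr = RawArray(tmp._type_, tmp)
# option 2: copy the flattened data into a 1-D RawArray of the matching ctypes type
raw_arr = RawArray(np.ctypeslib.as_ctypes_type(np_arr.dtype), np_arr.flatten())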

print(raw_arr[:])
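
For larger arrays, one way to do that single copy and then hand the buffer to read-only workers might look like the sketch below. The helper names (to_shared, from_shared, parallel_row_sums, and the module-level _shared dict) are hypothetical and not part of numpy or multiprocessing; the approach assumes the RawArray is passed to the workers through the Pool initializer, so the data is copied once in the parent and not again per worker.

```python
import numpy as np
from multiprocessing import Pool
from multiprocessing.sharedctypes import RawArray


def to_shared(np_arr):
    """Copy np_arr once into a new RawArray; return it with dtype and shape."""
    ctype = np.ctypeslib.as_ctypes_type(np_arr.dtype)
    raw_arr = RawArray(ctype, np_arr.size)  # zero-initialized shared buffer
    view = np.frombuffer(raw_arr, dtype=np_arr.dtype).reshape(np_arr.shape)
    np.copyto(view, np_arr)  # the one copy that cannot be avoided
    return raw_arr, np_arr.dtype, np_arr.shape


def from_shared(raw_arr, dtype, shape):
    """Wrap an existing RawArray in an ndarray without copying."""
    return np.frombuffer(raw_arr, dtype=dtype).reshape(shape)


_shared = {}  # per-process storage, filled in by the pool initializer


def _init_worker(raw_arr, dtype, shape):
    # runs once in every worker: build a view on the shared buffer
    _shared["arr"] = from_shared(raw_arr, dtype, shape)


def _row_sum(i):
    # read-only access to the shared data from a worker
    return int(_shared["arr"][i].sum())


def parallel_row_sums(np_arr, processes=2):
    """Example task: sum each row of np_arr in a pool of workers."""
    raw_arr, dtype, shape = to_shared(np_arr)
    with Pool(processes, initializer=_init_worker,
              initargs=(raw_arr, dtype, shape)) as pool:
        return pool.map(_row_sum, range(shape[0]))
```

On platforms that start processes with spawn, the call to parallel_row_sums should sit under an `if __name__ == "__main__":` guard. The original np_arr can be discarded after to_shared if memory is tight, since the workers only ever touch the shared buffer.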
Markus Dutschke answered Sep 23 '22 09:09
