I have a 60GB SciPy array (matrix) that I must share between 5+ multiprocessing Process objects. I've seen numpy-sharedmem and read this discussion on the SciPy list. There seem to be two approaches: numpy-sharedmem, or using a multiprocessing.RawArray() and mapping NumPy dtypes to ctypes. Now, numpy-sharedmem seems to be the way to go, but I've yet to see a good reference example. I don't need any kind of locks, since the array (actually a matrix) will be read-only. Due to its size, I'd like to avoid a copy. It sounds like the correct method is to create the only copy of the array as a sharedmem array and then pass it to the Process objects? A couple of specific questions:
1. What's the best way to actually pass the sharedmem handles to sub-Process()es? Do I need a queue just to pass one array around? Would a pipe be better? Can I just pass it as an argument to the Process() subclass's __init__() (where I'm assuming it's pickled)?
2. In the discussion I linked above, there's mention of numpy-sharedmem not being 64-bit safe? I'm definitely using some structures that aren't 32-bit addressable.
3. Are there tradeoffs to the RawArray() approach? Slower, buggier?
4. Do I need any ctype-to-dtype mapping for the numpy-sharedmem method?
5. Does anyone have an example of some open-source code doing this? I'm a very hands-on learner and it's hard to get this working without a good example to look at.
If there's any additional info I can provide to help clarify this for others, please comment and I'll add. Thanks!
This needs to run on Ubuntu Linux and maybe Mac OS, but portability isn't a huge concern.
If you are on Linux (or any POSIX-compliant system), you can define this array as a global variable. multiprocessing uses fork() on Linux when it starts a new child process. A newly spawned child process automatically shares the memory with its parent as long as it does not change it (the copy-on-write mechanism).
Since you are saying "I don't need any kind of locks, since the array (actually a matrix) will be read-only", taking advantage of this behavior would be a very simple and yet extremely efficient approach: all child processes will access the same data in physical memory when reading this large NumPy array.
Don't hand your array to the Process() constructor. That would instruct multiprocessing to pickle the data to the child, which would be extremely inefficient or impossible in your case. On Linux, right after fork() the child is an exact copy of the parent using the same physical memory, so all you need to do is make sure that the Python variable 'containing' the matrix is accessible from within the target function that you hand over to Process(). You can typically achieve this with a 'global' variable.
Example code:
from multiprocessing import Process
from numpy import random

# The single copy of the data lives in the parent; children created via
# fork() read it copy-on-write, so nothing is pickled or duplicated.
global_array = random.random(10**4)

def child():
    print(sum(global_array))

def main():
    processes = [Process(target=child) for _ in range(10)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()

if __name__ == "__main__":
    main()
On Windows -- which does not support fork() -- multiprocessing uses the win32 API call CreateProcess. It creates an entirely new process from any given executable. That's why on Windows one is required to pickle data to the child if one needs data that has been created during the runtime of the parent.
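For completeness, here is a minimal sketch of the multiprocessing.RawArray() route the question asks about (my own addition, not part of the answer above). The shared ctypes buffer is allocated once and wrapped in a NumPy view with numpy.frombuffer() on both sides, so no data is copied; the shape, dtype, and function names are illustrative assumptions:

from multiprocessing import Process, RawArray
import numpy as np

def child(raw, shape):
    # Wrap the shared ctypes buffer in a NumPy view -- no copy is made.
    arr = np.frombuffer(raw, dtype=np.float64).reshape(shape)
    print(arr.sum())

def main():
    shape = (1000, 100)
    raw = RawArray('d', shape[0] * shape[1])    # 'd' corresponds to numpy.float64
    arr = np.frombuffer(raw, dtype=np.float64).reshape(shape)
    arr[:] = np.random.rand(*shape)             # fill the single shared copy
    processes = [Process(target=child, args=(raw, shape)) for _ in range(4)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()

if __name__ == "__main__":
    main()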
@Velimir Mlaker gave a great answer. I thought I could add some bits of comments and a tiny example.
(I couldn't find much documentation on sharedmem - these are the results of my own experiments.)
Note that you can pass a sharedmem array to the child through the target and args arguments of Process. This is potentially better than using a global variable.

#!/usr/bin/env python
from multiprocessing import Process
import sharedmem
import numpy

def do_work(data, start):
    # Each worker writes to its own slot in the shared array.
    data[start] = 0

def split_work(num):
    n = 20
    width = n // num
    shared = sharedmem.empty(n)
    shared[:] = numpy.random.rand(1, n)[0]
    print("values are %s" % shared)

    processes = [Process(target=do_work, args=(shared, i * width))
                 for i in range(num)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()

    print("values are %s" % shared)
    print("type is %s" % type(shared[0]))

if __name__ == '__main__':
    split_work(4)
Output:

values are [ 0.81397784  0.59667692  0.10761908  0.6736734   0.46349645  0.98340718
  0.44056863  0.10701816  0.67167752  0.29158274  0.22242552  0.14273156
  0.34912309  0.43812636  0.58484507  0.81697513  0.57758441  0.4284959
  0.7292129   0.06063283]
values are [ 0.          0.59667692  0.10761908  0.6736734   0.46349645  0.
  0.44056863  0.10701816  0.67167752  0.29158274  0.          0.14273156
  0.34912309  0.43812636  0.58484507  0.          0.57758441  0.4284959
  0.7292129   0.06063283]
type is <class 'numpy.float64'>
This related question might be useful.
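As a further note (my own addition, not from the answers above): on newer Python versions the standard library's multiprocessing.shared_memory module can back a NumPy array directly, which avoids both pickling and third-party modules. A minimal sketch, with illustrative shapes and names:

from multiprocessing import Process, shared_memory
import numpy as np

def child(name, shape, dtype):
    # Attach to the existing shared-memory block and view it as an array.
    shm = shared_memory.SharedMemory(name=name)
    arr = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
    print(arr.sum())
    shm.close()

def main():
    data = np.random.rand(1000, 100)
    shm = shared_memory.SharedMemory(create=True, size=data.nbytes)
    shared = np.ndarray(data.shape, dtype=data.dtype, buffer=shm.buf)
    shared[:] = data                            # the single shared copy
    processes = [Process(target=child, args=(shm.name, data.shape, data.dtype))
                 for _ in range(4)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    shm.close()
    shm.unlink()                                # free the block when finished

if __name__ == "__main__":
    main()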