Does the scope of a numpy ndarray function differently within a function called by multiprocessing? Here is an example:
Using python's multiprocessing module I am calling a function like so:
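import multiprocessing as mp
import random
import numpy as np

jobs = []
for core in range(cores):
    # target could be f() or g()
    # args must be a tuple, hence the trailing comma
    proc = mp.Process(target=f, args=(core,))
    jobs.append(proc)
for job in jobs:
    job.start()
for job in jobs:
    job.join()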
def f(core):
    x = 0
    x += random.randint(0, 10)
    print(x)
def g(core):
    # Assume an array with 4 columns and n rows
    local = np.copy(globalshared_array[:, core])
    shuffled = np.random.permutation(local)
Calling f(core), the x variable is local to the process, i.e. each process prints a different random integer, as expected. The values never exceed 10, indicating that x = 0 at the start in each process. Is that correct?
Calling g(core) and permuting a copy of the array returns 4 identically 'shuffled' arrays. This seems to indicate that the working copy is not local to the child process. Is that correct? If so, other than using shared memory space, is it possible to have an ndarray be local to the child process when it needs to be filled from shared memory space?
EDIT:
Altering g(core) to add a random integer appears to have the desired effect: the arrays show different values. Something must be occurring in permutation that orders the columns (local to each child process) in the same way. Ideas?
def g(core):
    # Assume an array with 4 columns and n rows
    local = np.copy(globalshared_array[:, core])
    local += random.randint(0, 10)
EDIT II:
np.random.shuffle also exhibits the same behavior. The contents of the array are shuffled, but they are shuffled into the same order on each core.
Calling g(core) and permuting a copy of the array returns 4 identically 'shuffled' arrays. This seems to indicate that the working copy is not local to the child process.
What it likely indicates is that the random number generator is initialized identically in each child process, producing the same sequence. You need to seed each child's generator (perhaps throwing the child's process id into the mix).
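As a rough illustration of the seeding idea, here is a minimal sketch. The names shuffled_column and worker, the stand-in column np.arange(10), and the pid-plus-core seeding formula are all assumptions made for the example, not something from the question. As far as I know, Python's own random module is reseeded in forked children, which would explain why f() gave different values, whereas NumPy's global generator state is simply copied from the parent.

```python
import multiprocessing as mp
import os

import numpy as np


def shuffled_column(column, seed):
    """Return a permuted copy of `column` using a generator private to this call."""
    # A dedicated RandomState does not touch NumPy's global generator,
    # so the state inherited from the parent process no longer matters.
    rng = np.random.RandomState(seed)
    return rng.permutation(column)


def worker(core):
    # Hypothetical stand-in for globalshared_array[:, core].
    column = np.arange(10)
    # One possible seeding scheme (an assumption): mix the process id
    # with the core index and keep the result in the valid 32-bit range.
    seed = (os.getpid() * 1000 + core) % (2 ** 32)
    return core, shuffled_column(column, seed)


if __name__ == "__main__":
    with mp.Pool(processes=4) as pool:
        for core, result in pool.map(worker, range(4)):
            print(core, result)
```

Because each call builds its own generator from its own seed, the permutations should no longer match across children. The input column itself is left untouched, since permutation returns a copy.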
To seed a random array, this post was most useful. The following g(core) function succeeded in generating a random permutation for each core.
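def g(core):
    # _identity is a private attribute of the process object;
    # its first element is used here as a per-child seed
    pid = mp.current_process()._identity[0]
    randst = np.random.mtrand.RandomState(pid)
    randarray = randst.randint(0, 100, size=(1, 100))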
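For what it's worth, a variant that avoids the private _identity attribute is sketched below. It assumes a newer NumPy (1.17 or later) than the one used above, since it relies on np.random.SeedSequence and np.random.default_rng. The helper names make_child_seeds and random_row and the root seed 12345 are made up for the example.

```python
import multiprocessing as mp

import numpy as np


def make_child_seeds(n_workers, root_seed=12345):
    """Derive one independent SeedSequence per worker from a single root seed."""
    return np.random.SeedSequence(root_seed).spawn(n_workers)


def random_row(seed_seq):
    """Build a generator from the given seed and draw a 1 x 100 integer array."""
    rng = np.random.default_rng(seed_seq)
    # Same shape and range as the randint call above: values in [0, 100).
    return rng.integers(0, 100, size=(1, 100))


if __name__ == "__main__":
    seeds = make_child_seeds(4)
    with mp.Pool(processes=4) as pool:
        rows = pool.map(random_row, seeds)
    for core, row in enumerate(rows):
        print(core, row[0, :5])
```

The seeds are created in the parent and handed to the children, so the run is reproducible from the root seed while each child still gets its own stream.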