I have a large array of custom objects which I need to perform independent (parallelizable) tasks on, including modifying object parameters. I've tried using both a Manager().dict, and 'sharedmem'ory, but neither is working. For example:
import numpy as np
import multiprocessing as mp
import sharedmem as shm
class Tester:
num = 0.0
name = 'none'
def __init__(self,tnum=num, tname=name):
self.num = tnum
self.name = tname
def __str__(self):
return '%f %s' % (self.num, self.name)
def mod(test, nn):
test.num = np.random.randn()
test.name = nn
if __name__ == '__main__':
num = 10
tests = np.empty(num, dtype=object)
for it in range(num):
tests[it] = Tester(tnum=it*1.0)
sh_tests = shm.empty(num, dtype=object)
for it in range(num):
sh_tests[it] = tests[it]
print sh_tests[it]
print '\n'
workers = [ mp.Process(target=mod, args=(test, 'some') ) for test in sh_tests ]
for work in workers: work.start()
for work in workers: work.join()
for test in sh_tests: print test
prints out:
0.000000 none
1.000000 none
2.000000 none
3.000000 none
4.000000 none
5.000000 none
6.000000 none
7.000000 none
8.000000 none
9.000000 none
0.000000 none
1.000000 none
2.000000 none
3.000000 none
4.000000 none
5.000000 none
6.000000 none
7.000000 none
8.000000 none
9.000000 none
I.e. the objects aren't modified.
How can I achieve the desired behavior?
The problem is that when the objects are passed to the worker processes, they are packed up with pickle, shipped to the other process, where they are unpacked and worked on. Your objects aren't so much passed to the other process, as cloned. You don't return the objects, so the cloned object are happily modified, and then thrown away.
It looks like this can not be done (Python: Possible to share in-memory data between 2 separate processes) directly.
What you can do is return the modified objects.
import numpy as np
import multiprocessing as mp
class Tester:
num = 0.0
name = 'none'
def __init__(self,tnum=num, tname=name):
self.num = tnum
self.name = tname
def __str__(self):
return '%f %s' % (self.num, self.name)
def mod(test, nn, out_queue):
print test.num
test.num = np.random.randn()
print test.num
test.name = nn
out_queue.put(test)
if __name__ == '__main__':
num = 10
out_queue = mp.Queue()
tests = np.empty(num, dtype=object)
for it in range(num):
tests[it] = Tester(tnum=it*1.0)
print '\n'
workers = [ mp.Process(target=mod, args=(test, 'some', out_queue) ) for test in tests ]
for work in workers: work.start()
for work in workers: work.join()
res_lst = []
for j in range(len(workers)):
res_lst.append(out_queue.get())
for test in res_lst: print test
This does lead to the interesting observation that because the spawned processes are identical, they all start with the same seed for the random number, so they all generate the same 'random' number.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With