I am trying to use multiprocessing for the first time and having some fairly basic issues. I have a toy example below, where two processes are adding data to a list:
```python
from multiprocessing import Process

def add_process(all_nums_class, numbers_to_add):
    for number in numbers_to_add:
        all_nums_class.all_nums_list.append(number)

class AllNumsClass:
    def __init__(self):
        self.all_nums_list = []

all_nums_class = AllNumsClass()

p1 = Process(target=add_process, args=(all_nums_class, [1,3,5]))
p1.start()
p2 = Process(target=add_process, args=(all_nums_class, [2,4,6]))
p2.start()

all_nums_class.all_nums_list
```
I'd like to have the `all_nums_class` shared between these processes so that they can both add to its `all_nums_list` - so the result should be

```
[1, 2, 3, 4, 5, 6]
```

instead of what I'm currently getting, which is just good old

```
[]
```
Could anybody please advise? I have played around with namespace a bit but I haven't yet made it work here.
I feel I'd better mention (in case it makes a difference) that I'm doing this on Jupyter notebook.
You can use either a multiprocessing Queue or a Pipe to share data between processes. Queues are both thread- and process-safe. You will have to be more careful when using a Pipe, as the data in a pipe may become corrupted if two processes (or threads) try to read from or write to the same end of the pipe at the same time. Of course, there is no risk of corruption from processes using different ends of the pipe at the same time.
Currently, your implementation spawns two separate processes, each with its own `self.all_nums_list`. So you're essentially creating three `AllNumsClass` objects: one in your main program, one in `p1`, and one in `p2`. Since processes are independent and don't share the same memory space, each process is appending correctly, but only to its own copy of `self.all_nums_list`. That's why when you print `all_nums_class.all_nums_list` in your main program, you're printing the main process's `self.all_nums_list`, which is an empty list. To share the data and have the processes append to the same list, I would recommend using a Queue.
Example using Queue and Process
```python
import multiprocessing as mp

def add_process(queue, numbers_to_add):
    for number in numbers_to_add:
        queue.put(number)

class AllNumsClass:
    def __init__(self):
        self.queue = mp.Queue()

    def get_queue(self):
        return self.queue

if __name__ == '__main__':
    all_nums_class = AllNumsClass()

    processes = []
    p1 = mp.Process(target=add_process, args=(all_nums_class.get_queue(), [1,3,5]))
    p2 = mp.Process(target=add_process, args=(all_nums_class.get_queue(), [2,4,6]))

    processes.append(p1)
    processes.append(p2)

    for p in processes:
        p.start()
    for p in processes:
        p.join()

    output = []
    while all_nums_class.get_queue().qsize() > 0:
        output.append(all_nums_class.get_queue().get())

    print(output)
```
This implementation is asynchronous: the two processes run concurrently, so the numbers are not put on the queue in a fixed sequential order. Every time you run it, you may get a different output.
Example outputs
[1, 2, 3, 5, 4, 6]
[1, 3, 5, 2, 4, 6]
[2, 4, 6, 1, 3, 5]
[2, 1, 4, 3, 5, 6]
A simpler way to maintain an ordered or unordered list of results is to use the `mp.Pool` class, specifically its `Pool.apply` and `Pool.apply_async` methods. `Pool.apply` blocks the main program until each call has finished, which is quite useful if we want to obtain results in a particular order for certain applications. In contrast, `Pool.apply_async` submits all calls at once and lets them finish in whatever order they are scheduled. An additional difference is that we need to call `get` on the result object returned by `Pool.apply_async` in order to obtain the return value of the finished process.