Multiprocessing - Shared Array

So I'm trying to implement multiprocessing in Python, where I wish to have a Pool of 4-5 processes running a method in parallel. The purpose is to run a total of 1000 Monte Carlo simulations (200-250 simulations per process) instead of running all 1000 in a single process. I want each process to write to a common shared array by acquiring a lock on it as soon as it's done processing the result for one simulation, writing the result, and releasing the lock. So it should be a three-step process:

  1. Acquire lock
  2. Write result
  3. Release lock for other processes waiting to write to array.

Every time I pass the array to the processes, each process creates a copy of that array, which I do not want, as I want a common array. Can anyone help me with this by providing sample code?
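For reference, the three-step pattern described above can be expressed with multiprocessing.Array, whose elements live in shared memory rather than being copied into each child process. This is only a minimal sketch: run_simulation is a hypothetical placeholder for one Monte Carlo simulation, and the split into four workers mirrors the 4-5 processes mentioned above.

from multiprocessing import Process, Array

def run_simulation(i):
    """Hypothetical placeholder for one Monte Carlo simulation."""
    return i * 2.0

def worker(shared, start, stop):
    """Runs simulations start..stop-1 and writes each result into the
    shared array, holding the lock only for the write itself."""
    for i in range(start, stop):
        result = run_simulation(i)  # compute outside the lock
        with shared.get_lock():     # acquire -> write -> release
            shared[i] = result

if __name__ == "__main__":
    n = 1000
    shared = Array("d", n)  # 1000 doubles in shared memory, with a built-in lock
    step = n // 4
    workers = [Process(target=worker, args=(shared, i, i + step))
               for i in range(0, n, step)]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
    print(shared[:5])

Note that because each worker writes to a disjoint range of indices, the lock is not strictly needed here; that observation is essentially why the answer below recommends a Pool instead.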

asked Aug 24 '16 by Vedant7



1 Answer

Since you're only returning state from the child processes to the parent process, using a shared array and explicit locks is overkill. You can use Pool.map or Pool.starmap to accomplish exactly what you need. For example:

from multiprocessing import Pool

class Adder:
    """I'm using this class in place of a monte carlo simulator"""

    def add(self, a, b):
        return a + b

def setup(x, y, z):
    """Sets up the worker processes of the pool. 
    Here, x, y, and z would be your global settings. They are only included
    as an example of how to pass args to setup. In this program they would
    be "some arg", "another" and 2
    """
    global adder
    adder = Adder()

def job(a, b):
    """wrapper function to start the job in the child process"""
    return adder.add(a, b)

if __name__ == "__main__":   
    args = list(zip(range(10), range(10, 20)))
    # args == [(0, 10), (1, 11), ..., (8, 18), (9, 19)]

    with Pool(initializer=setup, initargs=["some arg", "another", 2]) as pool:
        # runs jobs in parallel and returns when all are complete
        results = pool.starmap(job, args)

    print(results) # prints [10, 12, ..., 26, 28] 
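Mapped onto the original Monte Carlo use case, the same Pool pattern might look like the sketch below. Here simulate is a hypothetical stand-in that takes a single seed argument, so plain Pool.map suffices instead of starmap:

from multiprocessing import Pool
import random

def simulate(seed):
    """Hypothetical placeholder: one Monte Carlo simulation per seed."""
    rng = random.Random(seed)
    return rng.random()

if __name__ == "__main__":
    with Pool(4) as pool:
        # The pool splits the 1000 jobs across 4 workers and returns
        # the results in input order -- no shared array or lock needed.
        results = pool.map(simulate, range(1000))
    print(len(results), results[:3])

Each worker returns its result to the parent through the pool's internal machinery, so the synchronization the question asks about is handled for you.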
answered Sep 20 '22 by Dunes