Importing Modules that use Multiprocessing in Python

I am looking to use the multiprocessing module to speed up the run time of some Transport Planning models. I've optimized as much as I can via 'normal' methods, but at the heart of it is an embarrassingly parallel problem, e.g. performing the same set of matrix operations for 4 different sets of inputs, all of them independent.

Pseudo Code:

    for mat1,mat2,mat3,mat4 in zip([a1,a2,a3,a4],[b1,b2,b3,b4],[c1,c2,c3,c4],[d1,d2,d3,d4]):
        result1 = mat1*mat2^mat3
        result2 = mat1/mat4
        result3 = mat3.T*mat2.T+mat4

So all I really want to do is process the iterations of this loop in parallel on a quad core computer. I've read up here and other places on the multiprocessing module and it seems to fit the bill perfectly except for the required:

    if __name__ == '__main__':

From what I understand, this means that you can only multiprocess code that is run as a script? I.e. if I do something like:

    import multiprocessing
    from numpy.random import randn

    a = randn(100,100)
    b = randn(100,100)
    c = randn(100,100)
    d = randn(100,100)

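    def process_matrix(mat):
        # Element-wise square; ^ is bitwise XOR and fails on float arrays
        return mat ** 2

    if __name__ == '__main__':
        print("Multiprocessing")
        jobs = []

        for input_matrix in [a, b, c, d]:
            p = multiprocessing.Process(target=process_matrix, args=(input_matrix,))
            jobs.append(p)
            p.start()

        # Wait for all workers to finish
        for p in jobs:
            p.join()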

It runs fine. However, suppose I save the above as 'matrix_multiproc.py' and define a new file 'importing_test.py' which just contains:

    import matrix_multiproc

The multiprocessing does not happen, because __name__ is now 'matrix_multiproc' and not '__main__'.

Does this mean I can never use parallel processing on an imported module? All I am trying to do is have my model run as:

    def Model_Run():
        import Part1, Part2, Part3, matrix_multiproc, Part4

        Part1.Run()
        Part2.Run()
        Part3.Run()
        matrix_multiproc.Run()
        Part4.Run()

Sorry for a really long question to what is probably a simple answer, thanks!

JMJR asked Nov 03 '11 00:11


1 Answer

Does this mean I can never use parallel processing on an imported module?

No, it doesn't. You can use multiprocessing anywhere in your code, provided that the program's main module uses the if __name__ == '__main__' guard.
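A minimal sketch of that layout, assuming Python 3 and using plain nested lists in place of the numpy arrays from the question (numpy arrays can be passed to the workers the same way, since they can be pickled). The function name run and its processes parameter are hypothetical stand-ins for the question's matrix_multiproc.Run(). Everything is shown in one file so it can be run directly; in a multi-file project the two functions would live in the imported module and only the guarded part would stay in the main script.

```python
import multiprocessing


def process_matrix(mat):
    """Square every element of a matrix given as a list of rows."""
    return [[value ** 2 for value in row] for row in mat]


def run(matrices, processes=4):
    """Process each matrix in a worker process; results keep the input order.

    Hypothetical counterpart of matrix_multiproc.Run() from the question.
    This function contains no __name__ check, so it works when imported.
    """
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(process_matrix, matrices)


# Only the program's entry point needs the guard. In a multi-file layout this
# part would sit in the main script, after `import matrix_multiproc`.
if __name__ == "__main__":
    inputs = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
    print(run(inputs, processes=2))
```

The worker function has to be defined at module level so that the child processes can locate it by name; a function nested inside run would not be picklable.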

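On Unix systems where multiprocessing uses the fork start method, you won't even need that guard, since child processes are created from the main Python process with the fork() system call. Keeping the guard is still the portable choice: since Python 3.8, macOS defaults to the spawn start method, which behaves as described for Windows below.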

On Windows, on the other hand, fork() is emulated by multiprocessing by spawning a new process that runs the main module again, using a different __name__. Without the guard here, your main application will try to spawn new processes again, resulting in an endless loop, and eating up all your computer's memory pretty fast.
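To see which of the two behaviours applies on a given machine, the active start method can be queried. This is a small sketch assuming Python 3.4 or later, where multiprocessing.get_start_method() is available; the helper name describe_start_method is made up for illustration.

```python
import multiprocessing


def describe_start_method():
    """Return the active start method and whether the main module is imported again."""
    method = multiprocessing.get_start_method()
    # "fork" copies the parent process as it is; "spawn" and "forkserver"
    # start from a fresh interpreter that imports the main module again,
    # which is the case where the __name__ guard is required.
    reimports_main = method in ("spawn", "forkserver")
    return method, reimports_main


if __name__ == "__main__":
    method, reimports_main = describe_start_method()
    print(f"start method: {method}, guard required: {reimports_main}")
```

Note that calling get_start_method() fixes the start method for the rest of the program, so it is best kept to diagnostics like this one.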

Ferdinand Beyer answered Oct 27 '22 00:10