importing and using a module that uses multiprocessing without causing an infinite loop on Windows

I have a module named multi.py. If I simply want to execute multi.py as a script, the workaround to avoid crashing on Windows (which otherwise spawns an infinite number of processes) is to put the multiprocessing code under:

if __name__ == '__main__':

However, I am trying to import it as a module from another script and call multi.start(). How can this be accomplished?

# multi.py
import multiprocessing

def test(x):
    x **= 2  # throwaway CPU-bound work; the result is discarded

def start():
    # leave two cores free, but never request fewer than one worker
    workers = max(1, multiprocessing.cpu_count() - 2)
    pool = multiprocessing.Pool(processes=workers)
    pool.map(test, range(1000 * 1000))
    pool.terminate()
    print('done.')

if __name__ == '__main__':
    print('runs as a script,',__name__)
else:
    print('runs as imported module,',__name__)

This is the test.py that I run:

# test.py
import multi
multi.start()
asked Jul 06 '12 by dmi

People also ask

Does Python multiprocessing work on Windows?

The multiprocessing package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads. Due to this, the multiprocessing module allows the programmer to fully leverage multiple processors on a given machine. It runs on both Unix and Windows.

Which is the method used to change the default way to create child processes in multiprocessing?

Python provides the ability to create and manage new processes via the multiprocessing.Process class. In multiprocessing programming, we may need to change the technique used to start child processes; this is called the start method, and it can be changed with multiprocessing.set_start_method().
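
As a minimal sketch (assuming Python 3.4+, where multiprocessing.set_start_method() is available):

import multiprocessing

def work():
    print('running in', multiprocessing.current_process().name)

if __name__ == '__main__':
    # 'spawn' is the default on Windows; 'fork' and 'forkserver'
    # are only available on Unix-like platforms
    multiprocessing.set_start_method('spawn')
    p = multiprocessing.Process(target=work)
    p.start()
    p.join()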

How do you use multiprocessing in Python?

In a typical example, you first import the Process class, then create a Process object with a target function (say, display()). The process is started with the start() method and waited on with the join() method. You can also pass arguments to the target function using the args keyword.
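
A minimal sketch of that pattern, with a hypothetical display() function as the target:

import multiprocessing

def display(name):
    print('hello,', name)

if __name__ == '__main__':
    p = multiprocessing.Process(target=display, args=('world',))  # args must be a tuple
    p.start()   # launch the child process
    p.join()    # wait for it to finish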

Why is multiprocessing slow in Python?

A multiprocessing version can be slower when it has to reload a model on every map call, because the mapped functions are assumed to be stateless and so pay the setup cost on each task. Note that in some cases this can be avoided by using the initializer argument to multiprocessing.Pool, which runs setup code once per worker process.
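
A sketch of that pattern; load_model() here is a stand-in for whatever expensive per-worker setup you actually have:

import multiprocessing

def load_model():
    # stand-in for an expensive one-time setup step
    return {'factor': 2}

_model = None  # per-worker global, set once by the initializer

def init_worker():
    global _model
    _model = load_model()  # runs once per worker process, not once per task

def score(item):
    return _model['factor'] * item  # every task reuses the per-worker "model"

if __name__ == '__main__':
    with multiprocessing.Pool(initializer=init_worker) as pool:
        print(pool.map(score, range(10)))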


1 Answer

I don't quite get what you're asking. You don't need to do anything to prevent this from spawning infinitely many processes. I just ran it on Windows XP (imported the file and ran multi.start()) and it completed fine in a couple of seconds.

The reason you have to do the if __name__ == "__main__" protection is that, on Windows, multiprocessing has to import the main script in order to run the target function, which means top-level module code in that file will be executed. The problem only arises if that top-level module code itself tries to spawn a new process. In your example, the top-level module code doesn't use multiprocessing, so there's no infinite process chain.
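
To make the failure mode concrete, here is a sketch of a script that does spawn processes from top-level module code (recent Python 3 versions detect this and raise a RuntimeError instead of spawning endlessly):

import multiprocessing

def f(x):
    return x * x

# No __main__ guard: on Windows every child re-imports this file
# and reaches this Pool creation again during its own bootstrap.
pool = multiprocessing.Pool()
print(pool.map(f, range(10)))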

Edit: Now I get what you're asking. You don't need to protect multi.py. You need to protect your main script, whatever it is. If you're getting a crash, it's because your main script calls multi.start() in top-level module code. Your script needs to look like this:

import multi
if __name__ == "__main__":
    multi.start()

The "protection" is always needed in the main script.

answered Sep 18 '22 by BrenBarn