1) Does the multiprocessing module support using a Python script file to start a second process instead of a function? Currently I use multiprocessing.Process, which takes a function, but I would like to execute foo.py instead. I could use subprocess.Popen, but the benefit of multiprocessing.Process is that I can pass objects (even if they are just pickled).
2) When I use multiprocessing.Process, why is my_module imported in the child process but print("foo") is not executed? How is my_module available although the main scope is not executed?
import multiprocessing
import my_module

print("foo")

def worker():
    print("bar")
    my_module.foo()

p = multiprocessing.Process(target=worker)
p.start()
p.join()
There is no fundamental difference between a Python function and a routine you want to run in another process; functions are just procedures. Say the script file you wish to run in another process (foo.py in this context) contains the following:
# for demonstration only
from stuff import do_things
a = 'foo'
b = 1
do_things(a, b) # it doesn't matter what this does
You could refactor foo.py this way:
from stuff import do_things

def foo():
    a = 'foo'
    b = 1
    do_things(a, b)
And in the module you are spawning the process:
import multiprocessing
from foo import foo

p = multiprocessing.Process(target=foo)
# ...
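A minimal sketch of the complete spawning module, assuming the refactored foo.py above (the if __name__ == "__main__" guard matters on platforms that use the spawn start method, such as Windows, where the child re-imports this module):

import multiprocessing
from foo import foo

if __name__ == "__main__":
    p = multiprocessing.Process(target=foo)
    p.start()  # run foo() in a child process
    p.join()   # wait for the child to finish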
The Process API requires that a "callable" be provided as the target. If, say, you tried to provide the module foo (where foo.py is the first version, without a function foo):
from multiprocessing import Process
import foo

p = Process(target=foo)
p.start()
You will get a TypeError: 'module' object is not callable, and for a good reason: a module is not a callable, and because its top-level code is not wrapped inside a function/procedure (a callable), it executes eagerly the moment you import it. Try inserting a print statement in a module file and importing it; module-level statements are evaluated right away.
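A minimal illustration, using a hypothetical module name side_effect.py:

# side_effect.py
print("imported!")  # module-level statement: runs at import time

# main.py
import side_effect  # prints "imported!" immediately
import side_effect  # prints nothing: the module is cached after the first import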
This answers question number 2: when you imported my_module at the top level, it was imported once per module, regardless of whether worker was ever executed. my_module was available to worker because worker looks up the name my_module in its module's global scope.
When you pass a subroutine like worker to a concurrent process, there is no guarantee when it will be called, or whether it will be called at all.
You could import a module anywhere in a Python module, including within a function/subroutine, but doing so in this case might not be optimal or necessary.
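For illustration, a function-level import would look like this (a minimal sketch; the import statement runs the first time worker is called and is a cheap cache lookup afterwards):

def worker():
    import my_module  # deferred until worker() is first called
    my_module.foo()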
You can use multiprocessing.Pool and pass the function you want to execute to one of its methods. I have personally used it because it lets you split the data into multiple parts and gives you the flexibility to choose the number of CPUs to use.
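A minimal sketch of that approach, assuming a picklable function square and four worker processes:

from multiprocessing import Pool

def square(x):
    return x * x

if __name__ == "__main__":
    # split the input across four worker processes and collect results in order
    with Pool(processes=4) as pool:
        results = pool.map(square, range(10))
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]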