I want to create a process using the Python multiprocessing module, as part of a larger script. (I also want a lot of other things from it, but right now I will settle for this.)
I copied the most basic code from the multiprocessing docs and modified it slightly.
However, everything outside of the if __name__ == '__main__': statement gets repeated every time p.join() is called.
This is my code:
from multiprocessing import Process

data = 'The Data'
print(data)

# worker function definition
def f(p_num):
    print('Doing Process: {}'.format(p_num))

print('start of name == main ')
if __name__ == '__main__':
    print('Creating process')
    p = Process(target=f, args=(data,))
    print('Process made')
    p.start()
    print('process started')
    p.join()
    print('process joined')

print('script finished')
This is what I expected:
The Data
start of name == main
Creating process
Process made
process started
Doing Process: The Data
process joined
script finished
Process finished with exit code 0
This is the reality:
The Data
start of name == main
Creating process
Process made
process started
The Data <- wrongly repeated line
start of name == main <- wrongly repeated line
script finished <- wrongly executed early line
Doing Process: The Data
process joined
script finished
Process finished with exit code 0
I am not sure whether this is caused by the if statement, p.join(), or something else, and by extension why this is happening. Can someone please explain what caused this and why?
For clarity, because some people could not reproduce my problem: I am using Windows Server 2012 R2 Datacenter and Python 3.5.3.
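On Windows, multiprocessing starts each child process in a fresh interpreter that imports the parent script (the "spawn" start method). In Python, when you import a script, all of its top-level code is executed, not just the function definitions. That is why 'The Data', 'start of name == main' and 'script finished' are printed a second time: the repetition happens when the child starts up after p.start(), not because of p.join(). On platforms where child processes are created with fork instead, the script is not re-imported, which is probably why some people could not reproduce this. As I understand it, __name__ is changed on an import of the script (check this SO answer for a better understanding), which is different from running the script on the command line directly, where __name__ == '__main__'. Because __name__ does not equal '__main__' in the child, the code inside if __name__ == '__main__': is not executed for your subprocess.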
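To see the effect of the re-import without starting any processes, here is a minimal sketch that runs a cut-down copy of the question's script twice with runpy.run_path, once as '__main__' and once under a different module name. The name '__mp_main__' is, as far as I know, what CPython uses for the re-imported script in a spawned child; treat it as an assumption, since the only thing that matters is that it is not '__main__'. The helper run_as and the file name demo_script.py are made up for this illustration.

```python
import contextlib
import io
import os
import runpy
import tempfile

# Cut-down version of the script from the question (it starts no process).
SCRIPT = """
print('The Data')
print('start of name == main')
if __name__ == '__main__':
    print('Creating process')
print('script finished')
"""


def run_as(module_name):
    """Run SCRIPT with __name__ set to module_name; return the printed lines."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'demo_script.py')  # hypothetical file name
        with open(path, 'w') as fh:
            fh.write(SCRIPT)
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            runpy.run_path(path, run_name=module_name)
    return buffer.getvalue().splitlines()


# What the parent prints when the script is run directly.
parent_lines = run_as('__main__')
# What a child prints when it re-imports the script under another name
# ('__mp_main__' is an assumption about the name used by spawn).
child_lines = run_as('__mp_main__')

print(parent_lines)
print(child_lines)
```

The second list contains every top-level print except the one inside the guard, which matches the three unexpected lines in the question's output.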
Anything you don't want executed in the child process should be moved into the if __name__ == '__main__': section of your code, as this will only run in the parent process, i.e. the script you ran initially.
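Applied to the script from the question, that could look like the sketch below. This is one possible arrangement, not the only one: the main() function is my own addition, and only the worker f and the import stay at module level, so a re-import in the child has nothing to print.

```python
from multiprocessing import Process


# Worker function: must stay at module level so the child can find it.
def f(p_num):
    print('Doing Process: {}'.format(p_num))


def main():
    # Everything that should run once, in the parent only, lives here.
    data = 'The Data'
    print(data)
    print('Creating process')
    p = Process(target=f, args=(data,))
    print('Process made')
    p.start()
    print('process started')
    p.join()
    print('process joined')
    print('script finished')


if __name__ == '__main__':
    main()
```

Putting the parent-only code in a function rather than directly under the guard is a matter of taste; it just keeps data and p from becoming module-level globals.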
Hope this helps a bit. There are more resources online that explain this better if you look around. I linked the official Python documentation for the multiprocessing module, and I recommend you look through it.