Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Multiprocessing or os.fork, os.exec?

Tags:

python

I am using multiprocessing module to fork child processes. Since on forking, child process gets the address space of parent process, I am getting the same logger for parent and child. I want to clear the address space of child process for any values carried over from parent. I got to know that multiprocessing does fork() at lower level but not exec(). I want to know whether it is good to use multiprocessing in my situation or should I go for os.fork() and os.exec() combination or is there any other solution?

Thanks.

like image 439
alice Avatar asked Dec 16 '22 14:12

alice


1 Answers

Since multiprocessing is running a function from your program as if it were a thread function, it definitely needs a full copy of your process' state. That means doing fork().

Using a higher-level interface provided by multiprocessing is generally better. At least you should not care about the fork() return code yourself.

os.fork() is a lower level function providing less service out-of-the-box, though you certainly can use it for anything multiprocessing is used for... at the cost of partial reimplementation of multiprocessing code. So, I think, multiprocessing should be ok for you.

However, if you process' memory footprint is too large to duplicate it (or if you have other reasons to avoid forking -- open connections to databases, open log files etc.), you may have to make the function you want to run in a new process a separate python program. Then you can run it using subprocess, pass parameters to its stdin, capture its stdout and parse the output to get results.

UPD: os.exec... family of functions is hard to use for most of purposes since it replaces your process with a spawned one (if you run the same program as is running, it will restart from the very beginning, not keeping any in-memory data). However, if you really do not need to continue parent process execution, exec() may be of some use.

From my personal experience: os.fork() is used very often to create daemon processes on Unix; I often use subprocess (the communication is through stdin/stdout); almost never used multiprocessing; not a single time in my life I needed os.exec...().

like image 135
Ellioh Avatar answered Dec 21 '22 09:12

Ellioh