I have a main Python process, and a bunch or workers created by the main process using os.fork()
.
I need to pass large and fairly involved data structures from the workers back to the main process. What existing libraries would you recommend for that?
The data structures are a mix of lists, dictionaries, numpy
arrays, custom classes (which I can tweak) and multi-layer combinations of the above.
Disk I/O should be avoided. If I could also avoid creating copies of the data -- for example by having some kind of shared-memory solution -- that would be nice too, but is not a hard constraint.
For the purposes of this question, it is mandatory that the workers are created using os.fork()
, or a wrapper thereof that would clone the master process's address space.
This only needs to work on Linux.
In Python, we have something known as Child Process. Now, what it is? Child Process is a process that contains complete data as that of the Parent Process which is the main process. The Child Process is just a copy of the Parent process.
Each thread has its own unique ID, and the threads of the same process share the process instructions and the data. One can use pdb on the main process and winpdb on the fork to debug a fork.pdb is the standard library for debugging winpdb is an advanced debugger available in python which supports multiple threading and breakpoints.
To add threading or forking, a new class can be created using the appropriate mix-in form. os.kill () is the method available in Python which is used to send signal to the specified process with the given process id. In this article, we learned all about forking in python.
Child Process is a process that contains complete data as that of the Parent Process which is the main process. The Child Process is just a copy of the Parent process. It copies local variables of Parent Process and any changes done in the Parent Process do not appear in the Child Process. Each process runs until it finishes and then exit.
multiprocessing
's queue implementation works. Internally, it pickles data to a pipe.
q = multiprocessing.Queue()
if (os.fork() == 0):
print(q.get())
else:
q.put(5)
# outputs: 5
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With