
Does the Parallel::ForkManager module support synchronization on global variables?

I'm very new to the Parallel::ForkManager module in Perl. It is well regarded, so I assume it supports what I need and I just haven't figured out how yet.

What I need is for each child process to write some updates into a global hash map, under a key computed in that child. However, when I declare the hash map outside the for loop and expect it to be updated after the loop, it stays empty. So although the update inside the loop succeeds (I can print the value there), it is not visible outside the loop.

Does anybody know how to write such a piece of code that does what I want?

galactica asked Feb 19 '10 01:02


3 Answers

This isn't really a Perl-specific problem, but a matter of understanding Unix-style processes. When you fork a new process, none of the memory is shared by default between processes. There are a few ways you can achieve what you want, depending on what you need.

One easy way would be to use something like BerkeleyDB to tie a hash to a file on disk. The tied hash can be initialized before you fork and then each child process would have access to it. BerkeleyDB files are designed to be safe to access from multiple processes simultaneously.

A more involved method would be to use some form of inter-process communication. For all the gory details of achieving such, see the perlipc manpage, which has details on several IPC methods supported by Perl.
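As a rough illustration of the IPC route, here is a minimal sketch using only the built-in pipe and fork: each child writes one "key<TAB>value" line to its own pipe and the parent merges those lines into its hash. The job names and the tab-separated line format are made up for the example; perlipc describes several other mechanisms.

```perl
use strict;
use warnings;

my %results;    # lives in the parent; children only send data back
my @readers;    # [pid, read handle] for every child

for my $n (1 .. 3) {
    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: compute a key and value, send them to the parent, exit.
        close $reader;
        print {$writer} "job$n\t", $n * 10, "\n";
        close $writer;
        exit 0;
    }

    # Parent: keep the read end only.
    close $writer;
    push @readers, [$pid, $reader];
}

for my $pair (@readers) {
    my ($pid, $reader) = @$pair;
    while (my $line = <$reader>) {
        chomp $line;
        my ($key, $value) = split /\t/, $line, 2;
        $results{$key} = $value;    # the update happens in the parent
    }
    close $reader;
    waitpid($pid, 0);
}

print "$_ => $results{$_}\n" for sort keys %results;
```

The point is that the hash is only ever modified in the parent; the children just report what should go into it.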

A final approach, if your Perl supports it, is to use threads and share variables between them.

friedo answered Nov 10 '22 16:11


Each fork call generates a brand new process, so updates to a hash variable in a child process are not visible in the parent (and changes made in the parent after the fork call are not visible in the child).

You could use threads (and see also threads::shared) so that a change written in one thread is visible in another thread.
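A minimal sketch of the threads approach, assuming your perl was built with thread support (check `perl -V:useithreads`). The worker count and the "job$n" keys are just placeholders for whatever each worker computes.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Marked :shared, so every thread sees the same hash.
my %results :shared;

my @workers = map {
    my $n = $_;
    threads->create(sub {
        my $key = "job$n";         # key computed inside the worker
        lock(%results);            # serialize updates to the shared hash
        $results{$key} = $n * $n;
    });
} 1 .. 4;

# Wait for all workers before reading the hash.
$_->join for @workers;

print "$_ => $results{$_}\n" for sort keys %results;
```

Unlike the fork case, the hash here really is one variable seen by all workers, which is why the lock is needed.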

Another option is to use interprocess communication to pass messages between parent and child processes. The Forks::Super module (of which I am the author) can make this less of a headache.

Or your child processes could write some output to files. When the parent process reaps them, it could load the data from those files and update its global hash map accordingly.
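The file-based idea might look something like the sketch below. It uses the core Storable and File::Temp modules; the file naming scheme and the "job$n" keys are invented for the example, so adapt them to your own data.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Storable qw(store retrieve);

my $dir = tempdir(CLEANUP => 1);   # scratch directory, removed by the parent at exit
my %results;                       # the parent's "global" hash
my %file_for;                      # pid => file that child will write

for my $n (1 .. 3) {
    my $file = "$dir/child_$n.sto";
    my $pid  = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: write its partial result to its own file, then exit.
        store({ "job$n" => $n * $n }, $file);
        exit 0;
    }
    $file_for{$pid} = $file;
}

# Parent: reap each child, then load whatever it left behind.
while ((my $pid = wait()) != -1) {
    my $file = delete $file_for{$pid} or next;
    next unless -e $file;
    my $partial = retrieve($file);
    @results{keys %$partial} = values %$partial;
}

print "$_ => $results{$_}\n" for sort keys %results;
```

Each child gets its own file, so no locking is needed, and the parent only reads a file after the child that wrote it has exited.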

mob answered Nov 10 '22 17:11


Read the "RETRIEVING DATASTRUCTURES from child processes" section of man Parallel::ForkManager. A child can send a data structure back when it finishes, and the parent retrieves it in a callback and uses it to populate its own data structures.
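A short sketch of that mechanism, assuming a Parallel::ForkManager version recent enough to include this feature. The child passes a reference as the second argument to finish, and the parent receives it as the sixth argument of the run_on_finish callback. The limit of 4 processes and the "job$n" keys are arbitrary choices for the example.

```perl
use strict;
use warnings;
use Parallel::ForkManager;

my %results;                                  # populated in the parent only
my $pm = Parallel::ForkManager->new(4);       # at most 4 children at a time

# Runs in the parent each time a child is reaped.
$pm->run_on_finish(sub {
    my ($pid, $exit_code, $ident, $exit_signal, $core_dump, $data) = @_;
    return unless defined $data;              # child sent nothing back
    @results{keys %$data} = values %$data;
});

for my $n (1 .. 5) {
    $pm->start and next;                      # parent continues the loop

    # Child: compute a partial result and hand it to the parent.
    my %partial = ("job$n" => $n * 2);
    $pm->finish(0, \%partial);
}

$pm->wait_all_children;

print "$_ => $results{$_}\n" for sort keys %results;
```

The data is serialized to a temporary file behind the scenes, so whatever the child sends has to be a reference to a serializable structure.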

psena answered Nov 10 '22 16:11