
What are the Perl techniques to detach just a portion of code to run independently?

Tags: fork, process, perl

I don't do much close-to-the-OS programming, but as far as I know, when it comes to doing something in parallel in Perl, the weapon of choice is fork, probably along with some useful modules built on top of it. The doc page for fork says:

Does a fork(2) system call to create a new process running the same program at the same point.

As a consequence, having a big application that consumes a lot of memory and calling fork for a small task means there will be 2 big perl processes, and the second will waste resources just to do some simple work.

So, the question is: what to do (or how to use fork, if it's the only method) in order to have a detached portion of code running independently and consuming just the resources it needs?

Just a very simple example:

    use strict;
    use warnings;

    my @big_array = ( 1 .. 2000000 );  # at least 80 MB memory
    sleep 10;  # to have time to easily inspect the memory usage

    fork();
    sleep 10;  # to have time to easily inspect the memory usage

and the child process consumes 80+ MB too.

To be clear: I don't need to communicate with this detached code or use its result. I just want to be able to say "hey, run this simple task for me in the background and let me continue my heavy work meanwhile ... and don't waste my resources!" while running a heavy perl application.

ArtMat asked Feb 19 '13 16:02


3 Answers

fork() followed by exec() is your bunny here. You fork() to create a new process (which is a fairly cheap operation, see below), then exec() in the child to replace the copy of the big perl with something smaller. It looks like this:

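    use strict;
    use warnings;
    use 5.010;

    my @ary = (1 .. 10_000_000);

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid) {
        # parent
        say "Forked $pid from $$; sleeping";
        sleep 1_000;
    } else {
        # child: list form, so 'sleep 1_000' reaches perl as one -e argument
        exec('perl', '-e', 'sleep 1_000') or die "exec failed: $!";
    }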

(@ary was just used to fill up the original process' memory a bit.)

I said that fork()ing was relatively cheap, even though it does copy the entire original process. These statements are not in conflict; the guys who designed fork noticed this same problem. The copy is lazy (copy-on-write): only the memory pages that are actually changed get copied.
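
A minimal sketch of what this means from the program's point of view, assuming a Unix-like system. It does not measure the laziness itself (that happens in the kernel); it only shows that a write in the child lands in the child's own copy and leaves the parent's data alone. The variable name `@data` is made up for illustration.

```perl
use strict;
use warnings;

my @data = (1 .. 5);

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: this write affects only the child's copy of the memory.
    $_ *= 10 for @data;
    print "child:  @data\n";
    exit 0;
}

# Parent: wait for the child, then show that our data is unchanged.
waitpid($pid, 0);
print "parent: @data\n";   # still 1 2 3 4 5
```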

If you find you want the processes to talk to each other, you'll start getting into the more complex domain of IPC, about which a number of books have been written.

darch answered Nov 02 '22 20:11


Your forked process is not actually using 80MB of resident memory. A large portion of that memory will be shared - 'borrowed' from the parent process until either the parent or child writes to it, at which point copy-on-write semantics will cause the memory to actually be copied.

If you want to drop that baggage completely, run exec in your fork. That will replace the child Perl process with a different executable, thus freeing the memory. It's also perfect if you don't need to communicate anything back to the parent.
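
For the fire-and-forget case, one possible sketch is a double fork followed by exec, assuming a Unix-like system. The helper name `run_detached` is hypothetical, not something from a module. The intermediate child exits at once, so the parent only waits a moment and the real worker never lingers as a zombie of the big process.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: start @cmd in the background and return at once.
sub run_detached {
    my @cmd = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # First child: fork again and exit right away, so the grandchild
        # is adopted by init and never becomes a zombie of the parent.
        my $grandchild = fork();
        POSIX::_exit(1) unless defined $grandchild;
        POSIX::_exit(0) if $grandchild;

        # Grandchild: replace the big perl image with the small command.
        exec { $cmd[0] } @cmd or POSIX::_exit(127);
    }

    # Parent: reap the short-lived first child, then carry on.
    waitpid($pid, 0);
    return $? == 0;
}

# $^X is the path of the perl binary that is currently running.
run_detached($^X, '-e', 'sleep 2');
```

The detached command still inherits STDIN, STDOUT and STDERR from the parent; redirecting them before the exec is a common extra step if that matters.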

rjh answered Nov 02 '22 19:11


There is no way to fork just a subset of your process's footprint, so the usual workarounds come down to:

  1. fork before you run memory-intensive code in the parent process
  2. start a separate process with system or open HANDLE,'|-',.... Of course this new process won't inherit any data from its parent, so you will need to pass data to this child somehow.
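
The second workaround could look roughly like this, assuming a Unix-like system where the list form of a pipe open is available. The helper name `send_to_child` and the sample data are made up for illustration. The data is passed to the child over its STDIN.

```perl
use strict;
use warnings;

# Hypothetical helper: hand a few lines of data to a small helper process.
sub send_to_child {
    my ($lines, @cmd) = @_;

    # List-form pipe open: forks, then execs @cmd with its STDIN
    # connected to $to_child. No shell is involved.
    open(my $to_child, '|-', @cmd) or die "cannot start @cmd: $!";
    print {$to_child} "$_\n" for @$lines;

    # close() on a pipe waits for the child and sets $? to its status.
    close($to_child);
    return $? >> 8;
}

my $status = send_to_child(
    [qw(alpha beta gamma)],
    $^X, '-ne', 'print uc',   # child: upper-case each input line
);
```

Because close() waits for the child, a parent that wants to keep working in the meantime would hold the handle open and close it later.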

mob answered Nov 02 '22 18:11