I have written a little Perl script that starts a program multiple times, with different parameters, in a for loop. The program does a numerical calculation and uses a whole CPU if it can get one. I have several CPUs available, so ideally I want to start as many instances of the program at once as there are available CPUs, but no more. Since other processes may be running, the number of available CPUs is not always the same.
What I have done so far is:
#!/usr/bin/perl
use strict;
use warnings;
use IPC::Open2;
use Parallel::ForkManager;

my $program = "./program";
my ($out, $in);
my $pid;
my $pm = new Parallel::ForkManager(44);

for my $x (0..100) {
    my $childpid = $pm->start and next;
    $pid = open2($out, $in, $program);
    print $in <<EOF;
#input involving $x
EOF
    my $printstring = "";
    while (<$out>) {
        if (/^\s*1\.000\s+(-\S+)D(\S+)\s*$/) {
            $printstring .= "$1e$2";
        }
    }
    print $printstring, "\n";
    waitpid($pid, 0);
    $pm->finish;
}
$pm->wait_all_children;
print "\n\n END\n";
This obviously contains a fixed number of processes to start, and thereby a fixed number of CPUs that can be used, and I have no idea how to go about changing this to flexibly determine the available CPUs and change the number of children accordingly. Any ideas how to do this?
Update:
Just to be clear, the limiting factor here is definitely the CPU time and not I/O stuff.
I looked into loadavg, but I am confused by its output.
68.71 66.40 63.72 70/1106 19247
At the same time, top showed
Tasks: 978 total, 23 running, 955 sleeping, 0 stopped, 0 zombie
Cpu(s): 2.1%us, 1.5%sy, 93.3%ni, 3.1%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
The number of CPUs is 48, so I would have thought that if the fourth number (in this case 70) is greater than 48, I should not start any more child processes, but according to top there seems to be some idle CPU there, although the fourth number is 70.
I'm going to suggest a slightly different tack: instead of throttling the number of active processes based on load, why not make use of SIGSTOP and SIGCONT?
Parallel::ForkManager provides a running_procs method, which returns a list of PIDs.
You can then send those processes SIGSTOP when the load average gets 'too high'.
You can work out what "too high" means using Sys::Info::CPU (which also reports load), or see the question "Number of processors/cores in command line".
Notionally: when the load goes too high, send SIGSTOP to some of your child processes. They should drop out of the run queue and remain visible but suspended. Once the load drops again, send SIGCONT to resume them.
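A minimal sketch of that idea, using only core Perl. The helper name throttle_children and the %stopped bookkeeping hash are made up for illustration and are not part of Parallel::ForkManager; the load figure and CPU count are assumed to be obtained elsewhere.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any module): suspend or resume at most
# one child per call, depending on whether the load exceeds the CPU count.
#   $load    - current load figure (e.g. the 1-minute load average)
#   $ncpu    - number of CPUs
#   $pids    - array ref of child PIDs (e.g. [ $pm->running_procs ])
#   $stopped - hash ref recording which PIDs we have suspended
# Returns the PID that was signalled, or undef if nothing was done.
sub throttle_children {
    my ($load, $ncpu, $pids, $stopped) = @_;

    if ($load > $ncpu) {
        # Too busy: suspend the first child that is not already stopped.
        for my $pid (@$pids) {
            next if $stopped->{$pid};
            kill 'STOP', $pid or next;    # skip PIDs we cannot signal
            $stopped->{$pid} = 1;
            return $pid;
        }
    }
    else {
        # Headroom available: resume one child we stopped earlier.
        for my $pid (@$pids) {
            next unless $stopped->{$pid};
            kill 'CONT', $pid or next;
            delete $stopped->{$pid};
            return $pid;
        }
    }
    return;
}
```

The parent would call this periodically, for example as throttle_children($load, 48, [ $pm->running_procs ], \%stopped). Two caveats apply. First, in the script from the question the PIDs known to Parallel::ForkManager belong to the forked Perl children, not to ./program started via open2, so stopping them does not by itself suspend the calculation; the signal would have to reach the open2 PID or the whole process group. Second, a single threshold can make children flip between stopped and running, so separate "stop" and "resume" thresholds may work better.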
As for the load average: the first three numbers are the load averaged over the last 1, 5 and 15 minutes. Look at the first, and if it is greater than the number of CPUs, you have contention.
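The check can be sketched as follows, assuming Linux. The function names are illustrative only. Per proc(5), the fourth field of /proc/loadavg is the number of currently runnable scheduling entities over the total number, and the fifth is the most recently created PID, so neither is a load average. Counting "processor" lines in /proc/cpuinfo works on typical x86 Linux systems but is an assumption, not a portable method.

```perl
use strict;
use warnings;

# Split one line of /proc/loadavg into named fields.
sub parse_loadavg {
    my ($line) = @_;
    my ($l1, $l5, $l15, $sched, $last_pid) = split ' ', $line;
    my ($runnable, $total) = split m{/}, $sched;
    return {
        load1    => $l1,
        load5    => $l5,
        load15   => $l15,
        runnable => $runnable,
        total    => $total,
        last_pid => $last_pid,
    };
}

# Count logical CPUs from the text of /proc/cpuinfo
# (assumes one "processor : N" line per CPU).
sub count_cpus {
    my ($cpuinfo) = @_;
    my $n = () = $cpuinfo =~ /^processor\s*:/mg;
    return $n;
}

# True if the 1-minute load exceeds the number of CPUs.
sub is_contended {
    my ($load1, $ncpu) = @_;
    return $load1 > $ncpu ? 1 : 0;
}

# Read a whole file into a string, e.g. slurp('/proc/loadavg').
sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/;
    my $text = <$fh>;
    close $fh;
    return $text;
}

# Example using the line and the CPU count from the question.
my $la = parse_loadavg("68.71 66.40 63.72 70/1106 19247");
printf "1-minute load %.2f on %d CPUs: %s\n",
    $la->{load1}, 48, is_contended($la->{load1}, 48) ? "contended" : "ok";
# prints "1-minute load 68.71 on 48 CPUs: contended"
```

With the figures from the question, the 1-minute load of 68.71 exceeds the 48 CPUs, which fits the top output showing only 3.1% idle.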