I wish to make a Perl program of mine use multiple cores. It progressively reads query input and compares chunks of it against a read-only data structure that is loaded from a file into memory for each run. That data structure, typically a few gigabytes, is a small set of packed strings used by small C routines. When processes are forked, everything is copied, which on a multi-core machine quickly blows up the RAM. I have tried several non-standard modules, but they all lead to slowness and/or blow up the RAM. I assumed that, for read-only data, Perl would not insist on making copies. Other languages manage it. Does anyone have ideas?
fork doesn't normally copy memory until it's modified (search for "copy-on-write", or COW). Are you sure you are measuring memory usage correctly? Compare before/after values reported by free rather than relying on top.
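To benefit from copy-on-write, the trick is simply to build the large read-only structure *before* forking, so every child inherits the same pages. Here is a minimal sketch (the blob size, worker count, and variable names are made up for illustration, not taken from the poster's program):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for the multi-gigabyte packed-string structure.
# Built once in the parent, before any fork().
my $big_blob = 'x' x 1_000_000;

my @pids;
for my $worker (1 .. 4) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: read-only access; the kernel keeps the pages shared.
        my $len = length $big_blob;
        exit($len == 1_000_000 ? 0 : 1);
    }
    push @pids, $pid;
}

# Parent: reap all workers.
waitpid($_, 0) for @pids;
print "all workers done\n";
```

Note that writing to the structure in a child (even incidentally, e.g. by modifying the string) would force the kernel to copy the affected pages for that child.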
EDIT: example script
Try running the following with settings like:
./fork_mem_usage 5 10000
./fork_mem_usage 25 10000
./fork_mem_usage 5 100000
./fork_mem_usage 25 100000
If the first increase is bigger than the subsequent ones, then fork is using copy-on-write. It almost certainly is (except on Windows, of course).
#!/usr/bin/perl
use strict;
use warnings;

my $num_kids = shift @ARGV;
my $arr_size = shift @ARGV;

print "$num_kids x $arr_size\n";

my @big_array = ('abcdefg') x $arr_size;
die "Array wrong length" unless ($arr_size == @big_array);

print_mem_usage('Start');

for my $i (1..$num_kids) {
    my $pid = fork();
    if ($pid) {
        if ($i % 5 == 0) {
            print_mem_usage($i);
        }
    }
    else {
        sleep(5);
        exit;
    }
}

print_mem_usage('End');
exit;

sub print_mem_usage {
    my $msg = shift;
    print "$msg: ";
    system q(free -m | grep buffers/cache | awk '{print $3}');
}