I just introduced threads to a Perl program, where one of its modules was using Memoize. I'm getting this error message:
Thread 1 terminated abnormally: Anonymous function called in forbidden scalar context; faulting
The error occurs only when I use both threads and Memoize; it disappears if I remove either one. The cause doesn't seem to be a lack of thread-safety in Memoize, though: in my code, all the memoization happens within the same thread.
Is this a bug with Memoize? Is there a way I can work around this? Otherwise I'm going to get rid of Memoize.
Here's some sample code to isolate the problem:
use strict;
use warnings;
use threads;
use Thread::Semaphore;
use Memoize;

my $semaphore = Thread::Semaphore->new;

memoize('foo');
sub foo {
    return shift;
}

sub invoke_foo {
    $semaphore->down; # ensure memoization is thread-safe
    my $result = foo(@_);
    $semaphore->up;
    return $result;
}

my @threads;
foreach (1 .. 5) {
    my $t = threads->create( sub { invoke_foo($_) });
    push @threads, $t;
}
$_->join foreach @threads;
Memoize stores the caches for every memoized function in one hash (instead of using a closure). It uses the address of the function as the index into that hash.
The problem is that the address of the function changes when it's cloned into a new thread. (Add print(\&foo, "\n"); in invoke_foo to see this.) It's a bug in Memoize.
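As a minimal sketch of that check (not part of the original answer; it assumes a perl built with ithreads, and uses refaddr from Scalar::Util rather than printing the reference directly), the following compares the address of foo in the main thread with the address seen inside a new thread:

```perl
use strict;
use warnings;
use threads;
use Scalar::Util qw(refaddr);

sub foo { return shift }

# Address of foo's code ref in the main thread.
my $main_addr = refaddr(\&foo);

# Address of foo as seen from inside a new thread, where the
# interpreter (including foo) has been cloned. The thread is
# created in explicit scalar context so join returns one value.
my $thread_addr =
    threads->create({ 'context' => 'scalar' }, sub { refaddr(\&foo) })->join;

printf "main thread: 0x%x\n", $main_addr;
printf "new thread:  0x%x\n", $thread_addr;
print  "addresses ", ($main_addr == $thread_addr ? "match" : "differ"), "\n";
```

If the two addresses differ, any lookup table keyed on the main thread's address will not find the function from inside the new thread, which is the failure mode described above.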
Workaround: Load the memoised module from within the threads. The following simulates (the relevant aspects of) that:
use strict;
use warnings;
use threads;
use Memoize;

sub foo {
    return shift;
}

sub invoke_foo {
    return foo(@_);
}

my @threads;
foreach (1 .. 5) {
    my $t = threads->create( sub {
        memoize('foo');
        invoke_foo($_);
    });
    push @threads, $t;
}
$_->join foreach @threads;
By the way, each thread has its own cache. That could also be considered a bug.
As noted, Memoize is not thread-aware. If you want per-thread memoization, ikegami's restructuring will work well. If instead you want global memoization, then replacing Memoize with something like the following could work:
use strict;
use warnings;
use 5.010;
use threads;
use threads::shared;

sub memoize_shared {
    my $name = shift;
    my $glob = do {
        no strict 'refs';
        \*{(caller)."::$name"}
    };
    my $code = \&$glob;
    my $sep  = $;;
    my (%scalar, %list) :shared;

    no warnings 'redefine';
    *$glob = sub {
        my $arg = join $sep => @_;
        if (wantarray) {
            # a shared hash can only hold shared references, so clone the result list
            @{$list{$arg} ||= shared_clone([&$code])}
        }
        else {
            exists $scalar{$arg}
                ? $scalar{$arg}
                :($scalar{$arg} = &$code)
        }
    }
}
and to use it:
sub foo {
    my $x = shift;
    say "foo called with '$x'";
    "foo($x)"
}

memoize_shared 'foo';

for my $t (1 .. 4) {
    threads->create(sub {
        my $x = foo 'bar';
        say "thread $t got $x"
    })->join
}
which prints:
foo called with 'bar'
thread 1 got foo(bar)
thread 2 got foo(bar)
thread 3 got foo(bar)
thread 4 got foo(bar)
The memoize_shared function above is fairly complicated because it deals with propagating list and scalar contexts as well as replacing the named subroutine. It is sometimes easier to just build the memoization into the target subroutine:
{
    my %cache :shared;
    sub foo {
        my $x = shift;
        if (exists $cache{$x}) {$cache{$x}}
        else {
            say "foo called with '$x'";
            $cache{$x} = "foo($x)"
        }
    }
}
Building the memoization into the subroutine does make it a bit more complicated, but it will be faster than using a wrapper function like memoize. And it gives you exact control over how to memoize the subroutine, including things like using a threads::shared cache.
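When several threads run at the same time (rather than being joined one by one), a shared cache built this way may also need locking so that two threads don't compute the same entry concurrently. The following is a minimal sketch under that assumption; slow_square and the $calls counter are hypothetical names used only for illustration, and lock() is the standard threads::shared function:

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my %cache :shared;       # one cache visible to every thread
my $calls :shared = 0;   # counts how often the real work is done

sub slow_square {
    my $n = shift;
    lock(%cache);        # lock is released when the sub returns
    return $cache{$n} if exists $cache{$n};
    $calls++;            # only updated while %cache is locked
    return $cache{$n} = $n * $n;
}

# Start four threads that all ask for the same value, then join them.
my @workers = map {
    threads->create({ 'context' => 'scalar' }, \&slow_square, 7)
} 1 .. 4;
my @results = map { $_->join } @workers;

print "results: @results\n";
print "computed $calls time(s)\n";
```

Holding the lock for the whole call serializes every lookup, which is the simplest correct choice; whether that cost is acceptable depends on how expensive the memoized function is.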
Memoize should work under threads, albeit a bit slower:
"There is some problem with the way goto &f works under threaded Perl, perhaps because of the lexical scoping of @_. This is a bug in Perl, and until it is resolved, memoized functions will see a slightly different caller() and will perform a little more slowly on threaded perls than unthreaded perls."