I'm chasing a couple of potential memory leaks in a Perl code base and I'd like to know about common pitfalls with regard to memory (mis-)management in Perl.
What are common leak patterns you have observed in Perl code?
Circular references are the canonical cause of leaks.
    sub leak {
        my ($foo, $bar);
        $foo = \$bar;   # $foo holds a reference to $bar
        $bar = \$foo;   # $bar holds a reference to $foo, closing the cycle
    }
Perl uses reference-counting garbage collection. This means that perl keeps a count of how many references point to any variable at a given time. If the variable goes out of scope and the count is 0, the variable is cleared.
In leak() above, $foo and $bar are never collected, and a copy of each will persist after every invocation of leak(), because both variables still have a reference count of 1 when the subroutine returns.
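The effect can be made visible with a small sketch. The Node class and build_pair() below are hypothetical helpers, not part of the answer's code: Node simply counts live instances via DESTROY, so we can check whether perl freed the objects once the subroutine returned.

```perl
use strict;
use warnings;

# Hypothetical helper class (an assumption for this sketch): it counts live
# instances so that we can observe whether perl has freed them.
package Node {
    my $live = 0;
    sub new        { my ($class) = @_; $live++; return bless { peer => undef }, $class }
    sub DESTROY    { $live-- }
    sub live_count { return $live }
}

# Builds two nodes; links them into a cycle only when $make_cycle is true.
sub build_pair {
    my ($make_cycle) = @_;
    my $left  = Node->new;
    my $right = Node->new;
    $left->{peer}  = $right;                 # left  -> right
    $right->{peer} = $left if $make_cycle;   # right -> left closes the loop
    return;                                  # both lexicals go out of scope
}

build_pair(0);
print Node->live_count, "\n";   # 0: no cycle, both nodes were freed

build_pair(1) for 1 .. 3;
print Node->live_count, "\n";   # 6: each call stranded two nodes
```

Without the back-link, both counts reach 0 and the nodes are destroyed. With it, each node keeps the other's count at 1, so every call leaves two objects behind.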
The easiest way to prevent this issue is to use weak references. Weak references are references that you follow to access data, but that do not count toward the reference count used for garbage collection.
    use Scalar::Util qw(weaken);

    sub dont_leak {
        my ($foo, $bar);
        $foo = \$bar;
        $bar = \$foo;
        weaken $bar;    # the reference held in $bar no longer counts
    }
In dont_leak(), $foo has a reference count of 0 and $bar has a ref count of 1, because the weakened reference held in $bar does not count toward $foo. When we leave the scope of the subroutine, $foo is returned to the pool, and its reference to $bar is cleared. This drops the ref count on $bar to 0, which means that $bar can also return to the pool.
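To check that the weakened version really frees everything, here is a minimal sketch along the same lines as dont_leak(). The Peer class and linked_pair() are hypothetical names introduced for illustration; weaken and isweak are the real Scalar::Util functions.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

# Hypothetical instance-counting class, assumed only for this sketch.
package Peer {
    my $live = 0;
    sub new        { my ($class) = @_; $live++; return bless { peer => undef }, $class }
    sub DESTROY    { $live-- }
    sub live_count { return $live }
}

# Links two objects in a cycle and optionally weakens the back-link.
# Returns whether the back-link was weak at the time the sub finished.
sub linked_pair {
    my ($use_weaken) = @_;
    my $left  = Peer->new;
    my $right = Peer->new;
    $left->{peer}  = $right;
    $right->{peer} = $left;
    weaken($right->{peer}) if $use_weaken;   # back-link stops counting
    return isweak($right->{peer}) ? 1 : 0;
}

print linked_pair(1), " ", Peer->live_count, "\n";   # prints "1 0"
print linked_pair(0), " ", Peer->live_count, "\n";   # prints "0 2"
```

Only one direction of the cycle needs to be weak. Which direction to weaken is a design decision: the weak side should be the one that is allowed to disappear first, typically a child's pointer back to its parent.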
Update: brian d foy asked if I have any data to back up my assertion that circular references are common. No, I don't have any statistics to show that circular references are common. They are the most commonly talked about and best documented form of perl memory leaks.
My experience is that they do happen. Here's a quick rundown on the memory leaks I have seen over a decade of working with Perl.
I've had problems with pTk apps developing leaks. Some leaks I was able to prove were due to circular references that cropped up when Tk passes window references around. I've also seen pTk leaks whose cause I could never track down.
I've seen people misunderstand weaken and wind up with circular references by accident.
I've seen unintentional cycles crop up when too many poorly thought out objects get thrown together in a hurry.
On one occasion I found memory leaks that came from an XS module that was creating large, deep data structures. I was never able to get a reproducible test case that was smaller than the whole program. But when I replaced the module with another serializer, the leaks went away. So I know those leaks came from the XS.
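The list above does not say which misunderstandings of weaken were involved. As an assumption on my part, two behaviours of Scalar::Util that commonly surprise people are sketched here: copying a weak reference gives you a strong one again, and weakening the only reference frees the data immediately. The function names are made up for the example.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

# Pitfall 1: a copy of a weak reference is a strong reference.
sub copy_is_strong {
    my $data = { name => 'example' };
    my $weak = $data;
    weaken($weak);
    my $copy = $weak;                 # plain assignment, so $copy is strong
    return (isweak($weak) ? 1 : 0, isweak($copy) ? 1 : 0);
}

# Pitfall 2: weakening the only reference frees the referent at once,
# and the variable becomes undef.
sub weakened_only_ref {
    my $only = { name => 'example' };
    weaken($only);
    return defined $only ? 1 : 0;
}

print join(',', copy_is_strong()), "\n";   # prints 1,0
print weakened_only_ref(), "\n";           # prints 0
```

The first case is the one that can quietly reintroduce a cycle: storing a copy of a weak back-link somewhere else in the structure makes that copy count again.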
So, in my experience cycles are a major source of leaks.
Fortunately, there is a module to help track them down.
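To give a feel for what such a tool does, here is a deliberately small sketch of cycle detection using only Scalar::Util. It is not the module referred to above, and has_strong_cycle() is a hypothetical name. It only walks hashes and arrays, and it revisits shared substructures, so it is slow on large graphs; CPAN modules such as Devel::Cycle are far more thorough.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr reftype weaken isweak);

# Returns 1 if following strong references through nested hashes and arrays
# from $node leads back to a container that is already on the current path.
sub has_strong_cycle {
    my ($node, $on_path) = @_;
    $on_path //= {};
    return 0 unless ref $node;

    my $type = reftype($node);
    return 0 unless $type eq 'HASH' || $type eq 'ARRAY';

    my $addr = refaddr($node);
    return 1 if $on_path->{$addr};           # we came back to an ancestor
    $on_path->{$addr} = 1;

    my $found = 0;
    if ($type eq 'HASH') {
        for my $key (keys %$node) {
            next if isweak($node->{$key});   # weak links do not keep data alive
            if (has_strong_cycle($node->{$key}, $on_path)) { $found = 1; last }
        }
    }
    else {
        for my $i (0 .. $#$node) {
            next if isweak($node->[$i]);
            if (has_strong_cycle($node->[$i], $on_path)) { $found = 1; last }
        }
    }

    delete $on_path->{$addr};                # leaving this branch of the walk
    return $found;
}

my $parent = { name => 'parent' };
my $child  = { name => 'child', parent => $parent };
$parent->{child} = $child;
print has_strong_cycle($parent), "\n";       # prints 1

weaken($child->{parent});
print has_strong_cycle($parent), "\n";       # prints 0
```

The check is done on the hash and array elements in place, because testing a copied value with isweak would always report a strong reference.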
As to whether big global structures that never get cleaned up constitute "leaks", I agree with brian. They quack like leaks (we have ever-growing process memory usage due to a bug), so they are leaks. Even so, I don't recall ever seeing this particular problem in the wild.
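A sketch of what such a global structure can look like, and one possible way to keep it bounded. Everything here is hypothetical: the names, the idea that entries are keyed by an id, and the limit of 100 entries are assumptions chosen for illustration.

```perl
use strict;
use warnings;
use constant MAX_ENTRIES => 100;    # hypothetical limit for the bounded variant

my %seen_forever;                   # file-scoped: lives as long as the process

# Leaky variant: every new id adds an entry that nothing ever deletes.
sub remember {
    my ($id) = @_;
    $seen_forever{$id} = { first_seen => time };
    return scalar keys %seen_forever;
}

my (%seen_bounded, @insertion_order);

# Bounded variant: evicts the oldest entries once MAX_ENTRIES is exceeded.
sub remember_bounded {
    my ($id) = @_;
    push @insertion_order, $id unless exists $seen_bounded{$id};
    $seen_bounded{$id} = { first_seen => time };
    delete $seen_bounded{ shift @insertion_order } while @insertion_order > MAX_ENTRIES;
    return scalar keys %seen_bounded;
}

my ($leaky, $bounded);
$leaky   = remember($_)         for 1 .. 1000;
$bounded = remember_bounded($_) for 1 .. 1000;
print "$leaky $bounded\n";          # prints "1000 100"
```

No reference cycle is involved in the first variant, and perl's reference counting is working as designed; the memory grows simply because the program keeps pointing at data it no longer needs.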
Based on what I see on Stonehenge's site, I guess brian sees a lot of sick code from people he is training or performing curative miracles for. So his sample set is easily much bigger and more varied than mine, but it has its own selection bias.
Which cause of leaks is most common? I don't think we'll ever really know. But we can all agree that circular references and global data junkyards are anti-patterns that need to be eliminated where possible, and handled with care and caution in the few cases where they make sense.