
Common Perl memory/reference leak patterns?

I'm chasing a couple of potential memory leaks in a Perl code base and I'd like to know about common pitfalls with regards to memory (mis-)management in Perl.

What are common leak patterns you have observed in Perl code?

asked Feb 08 '10 18:02 by knorv




1 Answer

Circular references are the canonical cause of leaks.

sub leak {
    my ($foo, $bar);
    $foo = \$bar;
    $bar = \$foo;
}

Perl uses reference-counting garbage collection. This means that perl keeps a count of how many references to each variable exist at any given time. When a variable goes out of scope and its count drops to 0, the variable is freed.

In the example code above, $foo and $bar are never collected. Every invocation of leak() leaves another pair behind, because each variable still holds a reference to the other and so both keep a reference count of 1.
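The effect can be observed with a destructor. The following is a minimal sketch, not part of the original answer: the Node class and its $destroyed counter are hypothetical names introduced only for this illustration. Objects caught in a cycle never reach DESTROY during normal execution, while objects linked one way are destroyed as soon as they leave scope.

```perl
use strict;
use warnings;

# Hypothetical class for illustration: counts how many objects were destroyed.
package Node;
our $destroyed = 0;
sub new     { my ($class) = @_; return bless {}, $class }
sub DESTROY { $destroyed++ }

package main;

sub leak {
    my $foo = Node->new;
    my $bar = Node->new;
    $foo->{other} = $bar;   # $foo holds a strong reference to $bar
    $bar->{other} = $foo;   # $bar holds a strong reference to $foo: a cycle
    return;
}

sub no_cycle {
    my $foo = Node->new;
    my $bar = Node->new;
    $foo->{other} = $bar;   # one-way link only, so no cycle
    return;
}

leak();
my $after_leak = $Node::destroyed;       # objects from leak() are still alive

no_cycle();
my $after_no_cycle = $Node::destroyed;   # objects from no_cycle() were freed

print "destroyed after leak():     $after_leak\n";      # prints 0
print "destroyed after no_cycle(): $after_no_cycle\n";  # prints 2
```

The leaked pair is only cleaned up when the interpreter exits, which is no help in a long-running process.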

The easiest way to prevent this issue is to use weak references. Weak references are references that you follow to access data, but do not count for garbage collection.

use Scalar::Util qw(weaken);

sub dont_leak {
    my ($foo, $bar);
    $foo = \$bar;
    $bar = \$foo;
    weaken $bar;
}

In dont_leak(), $foo has a reference count of 0, $bar has a ref count of 1. When we leave the scope of the subroutine, $foo is returned to the pool, and its reference to $bar is cleared. This drops the ref count on $bar to 0, which means that $bar can also return to the pool.
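The same idea applied to objects can be sketched as follows. This is an illustration under assumptions, not the answer's own code: Node and $destroyed are hypothetical names, while weaken and isweak are the real functions exported by Scalar::Util. Weakening the back-link lets both objects be destroyed when the subroutine returns.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

# Hypothetical class for illustration: counts how many objects were destroyed.
package Node;
our $destroyed = 0;
sub new     { my ($class) = @_; return bless {}, $class }
sub DESTROY { $destroyed++ }

package main;

sub dont_leak {
    my $foo = Node->new;
    my $bar = Node->new;
    $foo->{other} = $bar;      # strong link: $foo keeps $bar alive
    $bar->{other} = $foo;      # back-link, about to be weakened
    weaken($bar->{other});     # no longer counts toward $foo's reference count
    return isweak($bar->{other}) ? 1 : 0;
}

my $was_weak        = dont_leak();
my $destroyed_count = $Node::destroyed;

print "back-link was weak: $was_weak\n";         # prints 1
print "objects destroyed:  $destroyed_count\n";  # prints 2
```

Only one direction of the cycle needs to be weak. Usually that is the link from the owned object back to its owner.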

Update: brian d foy asked if I have any data to back up my assertion that circular references are common. No, I don't have any statistics to show that circular references are common. They are, however, the most commonly talked about and best documented form of perl memory leaks.

My experience is that they do happen. Here's a quick rundown on the memory leaks I have seen over a decade of working with Perl.

I've had problems with pTk apps developing leaks. Some leaks I was able to prove were due to circular references that cropped up when Tk passes window references around. I've also seen pTk leaks whose cause I could never track down.

I've seen people misunderstand weaken and wind up with circular references by accident.
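The answer does not say which misunderstandings were involved, so the following sketch only shows two documented behaviours of Scalar::Util's weaken that are easy to trip over; treating them as the likely culprits is an assumption. First, copying a weak reference yields a strong one, so storing that copy back into a structure can quietly rebuild the cycle. Second, weakening the only reference to something frees it immediately.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

my %parent = (name => 'parent');
my %child  = (name => 'child');

$child{parent} = \%parent;
weaken($child{parent});               # this hash slot is now a weak reference

my $copy = $child{parent};            # a copy of a weak reference is STRONG

my $slot_is_weak = isweak($child{parent}) ? 1 : 0;
my $copy_is_weak = isweak($copy)          ? 1 : 0;

# Weakening the sole reference leaves nothing holding the data alive,
# so the hash is freed at once and the variable becomes undef.
my $only = { data => 42 };
weaken($only);
my $only_is_gone = defined $only ? 0 : 1;

print "slot is weak: $slot_is_weak\n";   # prints 1
print "copy is weak: $copy_is_weak\n";   # prints 0
print "only is gone: $only_is_gone\n";   # prints 1
```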

I've seen unintentional cycles crop up when too many poorly thought out objects get thrown together in a hurry.
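One way such an accidental cycle can arise is a callback stored inside the object it closes over. This is a sketch under assumptions: the Widget class, its on_click slot, and the weak option are hypothetical and not taken from the answer. The closure captures $self, the object stores the closure, and the two keep each other alive unless the captured variable is weakened.

```perl
use strict;
use warnings;

# Hypothetical class for illustration only.
package Widget;
use Scalar::Util qw(weaken);

our $destroyed = 0;

sub new {
    my ($class, %args) = @_;
    my $self = bless {}, $class;
    if ($args{weak}) {
        my $weak_self = $self;
        weaken($weak_self);   # the closure below captures only the weak variable
        $self->{on_click} = sub { $weak_self->handle };
    }
    else {
        # The closure captures $self and $self stores the closure: a cycle.
        $self->{on_click} = sub { $self->handle };
    }
    return $self;
}

sub handle  { return 'clicked' }
sub DESTROY { $destroyed++ }

package main;

{ my $w = Widget->new; }
my $after_strong = $Widget::destroyed;   # the widget is still alive

{ my $w = Widget->new(weak => 1); }
my $after_weak = $Widget::destroyed;     # the second widget was freed

print "destroyed after strong closure: $after_strong\n";  # prints 0
print "destroyed after weak closure:   $after_weak\n";    # prints 1
```

A callback that holds only a weak reference should be prepared for that reference to be undef if it can run after the object has gone away.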

On one occasion I found memory leaks that came from an XS module that was creating large, deep data structures. I was never able to get a reproducible test case that was smaller than the whole program. But when I replaced the module with another serializer, the leaks went away. So I know those leaks came from the XS.

So, in my experience cycles are a major source of leaks.

Fortunately, there is a module to help track them down.

As to whether big global structures that never get cleaned up constitute "leaks", I agree with brian. They quack like leaks (we have ever-growing process memory usage due to a bug), so they are leaks. Even so, I don't recall ever seeing this particular problem in the wild.
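A global structure of this kind might look like the sketch below. It is a hypothetical illustration: %seen_requests and both handler subs are invented names. No cycle is involved and every entry is still reachable, yet the hash grows with each call because nothing ever removes an entry.

```perl
use strict;
use warnings;

# File-scoped hash that lives for as long as the process does.
my %seen_requests;

sub handle_request_leaky {
    my ($id) = @_;
    # Entry is added and never removed, so the hash grows without bound.
    $seen_requests{$id} = { id => $id, payload => 'x' x 1024 };
    return;
}

sub handle_request_bounded {
    my ($id) = @_;
    $seen_requests{$id} = { id => $id, payload => 'x' x 1024 };
    # ... work with the entry here ...
    delete $seen_requests{$id};   # release the entry once the work is done
    return;
}

handle_request_leaky($_) for 1 .. 1000;
my $leaky_entries = scalar keys %seen_requests;

%seen_requests = ();
handle_request_bounded($_) for 1 .. 1000;
my $bounded_entries = scalar keys %seen_requests;

print "entries after leaky handler:   $leaky_entries\n";    # prints 1000
print "entries after bounded handler: $bounded_entries\n";  # prints 0
```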

Based on what I see on Stonehenge's site, I guess brian sees a lot of sick code from people he is training or performing curative miracles for. So his sample set is easily much bigger and more varied than mine, but it has its own selection bias.

Which cause of leaks is most common? I don't think we'll ever really know. But we can all agree that circular references and global data junkyards are anti-patterns that need to be eliminated where possible, and handled with care and caution in the few cases where they make sense.

answered Sep 28 '22 03:09 by daotoad