
Leaking memory with Cocoa garbage collection

I've been beating my head against a wall trying to figure out why I have a memory leak in a garbage-collected Cocoa app. (The memory usage in Activity Monitor just grows and grows, and running the app with the GC Monitor instrument also shows an ever-growing graph.)

I eventually narrowed it down to a single pattern in my code. Data was being loaded into an NSData and then parsed by a C library (the data's bytes and length were passed into it). The C library has callbacks that fire and return substring start pointers and lengths (to avoid internal copying). However, for my purposes, I needed to turn them into NSStrings and keep them around for a while. I did this using NSString's initWithBytes:length:encoding: method. I assumed that would copy the bytes and that NSString would manage the copy appropriately, but something is going wrong, because this leaks like crazy.

This code will "leak" or somehow trick the garbage collector:

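- (void)meh
{
    NSData *data = [NSData dataWithContentsOfFile:[[NSBundle mainBundle] pathForResource:@"holmes" ofType:@"txt"]];
    const int substrLength = 80;

    // Walk the file's bytes in 80-byte chunks and build a throwaway NSString from each one.
    for (const char *substr = [data bytes]; substr-(const char *)[data bytes] < [data length]; substr += substrLength) {
        // Clamp the final chunk so we never read past the end of the buffer.
        NSUInteger remaining = [data length] - (NSUInteger)(substr - (const char *)[data bytes]);
        NSString *cocoaString = [[NSString alloc] initWithBytes:substr length:MIN((NSUInteger)substrLength, remaining) encoding:NSUTF8StringEncoding];
        [cocoaString length];
    }
}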

I can put this in a timer and just watch memory usage go up and up in Activity Monitor as well as in the GC Monitor instrument. (holmes.txt is 594KB.)

This isn't the best code in the world, but it shows the problem. (I'm running 10.6, the project is targeted for 10.5 - if that matters). I read over the garbage collection docs and noticed a number of possible pitfalls, but I don't think I'm doing anything obviously against the rules here. Doesn't hurt to ask, though. Thanks!


[Image: the object graph just growing and growing]

Sean asked Jan 19 '10 17:01


1 Answer

This is an unfortunate edge case. Please file a bug (http://bugreport.apple.com/) and attach your excellent minimal example.

The problem is twofold:

  • The main event loop (MEL) isn't running and, thus, the collector isn't triggered by MEL activity. This leaves the collector doing only its normal background, threshold-based collections.

  • The NSData object stores the data read from the file in a malloc'd buffer that is allocated from the malloc zone. Thus, the GC-accounted allocation (the NSData object itself) is really tiny, but it points to something really large (the malloc allocation). The end result is that the collector's threshold isn't hit and it doesn't collect. Obviously, improving this behavior is desired, but it is a hard problem.

This is a very easy bug to reproduce in a micro-benchmark or in isolation. In practice, there is typically enough going on that this problem won't happen. However, there may be certain cases where it does become problematic.

Change your code to this and the collector will collect the data objects. Note that you shouldn't use collectExhaustively often -- it does eat CPU.

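- (void)meh
{
    NSData *data = [NSData dataWithContentsOfFile:[[NSBundle mainBundle] pathForResource:@"holmes" ofType:@"txt"]];
    const int substrLength = 80;

    for (const char *substr = [data bytes]; substr-(const char *)[data bytes] < [data length]; substr += substrLength) {
        // Clamp the final chunk so we never read past the end of the buffer.
        NSUInteger remaining = [data length] - (NSUInteger)(substr - (const char *)[data bytes]);
        NSString *cocoaString = [[NSString alloc] initWithBytes:substr length:MIN((NSUInteger)substrLength, remaining) encoding:NSUTF8StringEncoding];
        [cocoaString length];
    }
    // Reference data once more so it stays alive until the loop is done with its bytes.
    [data self];
    // Force a full collection. This is CPU-expensive, so don't call it often.
    [[NSGarbageCollector defaultCollector] collectExhaustively];
}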

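The [data self] call keeps the data object alive past what would otherwise be the last reference to it, so the collector cannot reclaim the buffer while the loop is still reading from it.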

bbum answered Nov 02 '22 01:11