
A 90/10 Rule for Memory Management? [closed]

Most programmers agree that garbage collection is a great thing, and in most applications is well worth the overhead. However, my personal observation is that memory management for most objects is trivial, and maybe 10%-20% of them account for the need for kludges such as reference counting and really complicated memory management schemes in general. It seems to me like one could get all the benefits of garbage collection with only a small fraction of the overhead by conservatively deleting large objects manually where the object's lifetime is obvious and letting the GC collect the rest, assuming the GC implementation supports such a thing. This would allow the GC to run much less frequently, and consume less excess memory, while still avoiding cases that are actually hard to manage manually. Even more interesting would be if the compiler inserted deterministic delete statements automatically where lifetimes were obvious:

int myFunc() {
    Foo[] foo = new Foo[arbitraryNumber];  // May be too big to stack allocate.
    // do stuff such that the compiler can prove foo doesn't escape.
    // foo is obviously no longer needed, can be automatically deleted here.
    return someInteger;
}

Of course, this might not work well with a copying GC, but for the sake of this post let's assume our GC isn't copying. Why are such hybrid memory management schemes apparently so rare in mainstream programming languages?

dsimcha, asked Feb 12 '09
1 Answer

Because this case is too rare. Almost no method is isolated: nearly all of them either accept objects from outside, or create objects and pass them back out.

A method which doesn't access any fields, has no parameters, and doesn't return anything can't do anything useful.
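
To make this concrete, here is a hypothetical Java sketch (the class names are made up for illustration) where the allocation escapes the method in two ways at once, so neither the programmer nor the compiler can safely free it locally:

import java.util.ArrayList;
import java.util.List;

class Order {
    final String customer;
    Order(String customer) { this.customer = customer; }
}

class OrderService {
    private final List<Order> recent = new ArrayList<>();

    // The new Order escapes twice: it is stored in a field and returned to the
    // caller, so its lifetime depends on `recent` and on every caller that
    // keeps the result. No local analysis can insert a delete here.
    Order createOrder(String customer) {
        Order o = new Order(customer);
        recent.add(o);
        return o;
    }
}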

Instead, GCs concentrate on the most common case (the 90%) and try to keep those 90% (the short-lived temporary objects) in check. This means that in the common case you have fewer objects to check, and the rest doesn't matter that much. On top of that, you use an incremental sweep, so the collector runs in little sprints that each interrupt the program only for a short moment.
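
As a minimal sketch of why that pays off (the class name TempChurn is made up; the flags in the comments are standard HotSpot options), consider a loop that churns through short-lived temporaries. A generational collector only traces and copies the few objects that are still live when a young-generation collection runs, so the millions of dead temporaries cost essentially nothing:

// Run with e.g. -verbose:gc to watch the young-generation ("minor")
// collections; -Xmn can be used to size the young generation.
public class TempChurn {
    public static void main(String[] args) {
        long sum = 0;
        for (int i = 0; i < 10_000_000; i++) {
            // One temporary per iteration; it becomes garbage almost
            // immediately, so the collector never touches it individually:
            // only live objects are traced and copied during a minor collection.
            String temp = "value-" + i;
            sum += temp.length();
        }
        System.out.println(sum); // keep the work from being optimized away
    }
}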

I once tried to come up with a better GC algorithm and failed miserably. The approaches production GCs use border on the arcane. The document about Java 5 GC Performance Tuning should give you some ideas, and there is of course the GC article on Wikipedia.

What it boils down to: using a GC can even be faster than a traditional scheme of manual allocation and freeing. Think of the classic copying algorithm, which just locates every reachable object and copies it to a new place. If you have simply forgotten about a whole lot of objects (say, 90% of everything allocated), the algorithm only needs to visit the live 10%. Anything it can't reach, no matter how much there may be, costs nothing. Now you may say that copying is expensive, but a) that isn't really true, since today an average desktop CPU can copy 40 MB in less than 100 ms, and b) copying compacts the heap, which protects you against fragmentation, so it is actually a good thing.
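
Here is a toy sketch of that copying ("scavenging") idea in Java. The names (Cell, evacuate, collect) are invented for illustration, and plain objects stand in for raw heap memory, but the essential property carries over: the work done is proportional to the number of reachable cells, so the 90% of dead objects cost nothing at collection time.

import java.util.ArrayList;
import java.util.List;

final class CopyingGcSketch {
    // A heap cell with outgoing references and a forwarding pointer that is
    // set once the cell has been copied to to-space.
    static final class Cell {
        Cell[] refs;
        Cell forwardedTo;
        Cell(int fanout) { refs = new Cell[fanout]; }
    }

    // Copies everything reachable from the roots into a fresh to-space and
    // returns it. Cells that are not reachable are simply never visited.
    static List<Cell> collect(List<Cell> roots) {
        List<Cell> toSpace = new ArrayList<>();
        for (Cell root : roots) {
            evacuate(root, toSpace);
        }
        // Cheney-style scan: walk the copied cells in order and redirect their
        // references to the copies, evacuating referents as we find them.
        for (int scan = 0; scan < toSpace.size(); scan++) {
            Cell c = toSpace.get(scan);
            for (int i = 0; i < c.refs.length; i++) {
                if (c.refs[i] != null) {
                    c.refs[i] = evacuate(c.refs[i], toSpace);
                }
            }
        }
        return toSpace; // the old from-space can now be discarded wholesale
    }

    // Copies a cell to to-space on first visit and returns the copy.
    private static Cell evacuate(Cell c, List<Cell> toSpace) {
        if (c.forwardedTo == null) {
            Cell copy = new Cell(c.refs.length);
            System.arraycopy(c.refs, 0, copy.refs, 0, c.refs.length);
            c.forwardedTo = copy;
            toSpace.add(copy);
        }
        return c.forwardedTo;
    }
}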

Aaron Digulla, answered Sep 26 '22