
Reasons why one should not call the garbage collector directly

I'm currently writing a paper for my company, about how to avoid calling the garbage collector directly from the code (when playing with COM objects for instance).

I know this is a bad practice, and should be only considered in very rare cases, but I can't seem to find a way to tell why it should be avoided. And I don't want to rely on the "The G.C. is smarter than you" principle (even if it is the truth :-) )

So can you give me some pointers on why one should avoid calling the garbage collector directly? (Performance impact?) If you have links on this particular topic, they would be very helpful too.

Thanks in advance!

Edit: All the answers provided so far are really helpful. As I can't accept every one of them (or can I?), what should I do? Make it a community wiki?

Shimrod asked Jun 16 '10 10:06


2 Answers

The main reason is that a program that spends more time than necessary performing full collections in the GC will be slower than it needs to be.

Given that this is a situation where doing less (not calling the GC yourself) gets you better performance, it seems like a no-brainer to me!

NB. When playing with COM objects, calling the GC directly is unlikely to solve your problem. If COM objects are hanging around then they have non-zero ref counts, and no amount of extra GC calls will fix that.
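The point about references can be illustrated with a small Python analogy. This is only a sketch: it uses Python's own `gc` and `weakref` modules rather than COM, and the `Wrapper` class and `is_alive_after_collect` helper are hypothetical names introduced for illustration. It shows that forcing a collection does nothing for an object that something still references, while dropping the reference is what actually releases it.

```python
import gc
import weakref


class Wrapper:
    """Hypothetical stand-in for an object that some code still references."""


def is_alive_after_collect(probe):
    """Force a full collection, then report whether the probed object survived."""
    gc.collect()
    return probe() is not None


held = Wrapper()
probe = weakref.ref(held)  # observes the object without keeping it alive

# The object is still referenced, so an explicit collection cannot reclaim it.
alive_while_referenced = is_alive_after_collect(probe)

held = None  # drop the last strong reference
alive_after_release = is_alive_after_collect(probe)

print(alive_while_referenced, alive_after_release)
```

The fix in the second half is releasing the reference, not the extra collection.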

Daniel Earwicker answered Dec 03 '22 08:12


The usual performance argument runs thus:

Generational GCs are fast because they rely on the heuristic that many allocated objects are short-lived (an object is "live" as long as it is reachable; the point of the GC is to detect "dead" objects and reclaim their memory). This means that new objects can be accumulated in a special area (the "young generation"); the GC runs when that area is full and scavenges the live objects, physically moving them into the old generation. In most generational GCs, this operation implies a "stop-the-world" pause, which is tolerable because it is short (the young generation is of limited size). The fact that the world is paused during a collection of the young generation allows for efficient handling of young objects: reading or writing a reference in a young object's fields is a mere memory access, with no need to account for concurrent access from a GC thread or an incremental mark-and-sweep.
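As a rough illustration of the scavenging step described above, here is a toy Python model. It is a sketch under simplifying assumptions, not how any real collector is implemented: `Obj` and `scavenge` are hypothetical names, the heap is a pair of Python lists, and references from old objects into the young generation (which real collectors track separately) are ignored. The thing to notice is that the work done is proportional to the number of live young objects; dead ones are never visited.

```python
class Obj:
    """Toy heap object: a name plus outgoing references."""

    def __init__(self, name, refs=()):
        self.name = name
        self.refs = list(refs)


def scavenge(roots, young, old):
    """Move every young object reachable from the roots into the old generation.

    Returns the number of objects moved, which is the cost of the collection.
    """
    young_ids = {id(o) for o in young}
    seen = set()
    stack = list(roots)
    survivors = []
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        if id(obj) in young_ids:
            survivors.append(obj)
        stack.extend(obj.refs)
    old.extend(survivors)
    young.clear()  # dead objects are dropped wholesale, without being visited
    return len(survivors)


a = Obj("a")
b = Obj("b", refs=[a])
temp1, temp2 = Obj("temp1"), Obj("temp2")  # garbage: nothing refers to them

young, old = [a, b, temp1, temp2], []
moved = scavenge(roots=[b], young=young, old=old)
print(moved, sorted(o.name for o in old))
```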

A young generation collected as described above is efficient because, by the time it is collected, most of the objects in it are already dead, so they incur no extra cost. The optimal size of the young generation is a trade-off between the worst case (all young objects are live, which implies the maximum pause time) and the average efficiency (when the young generation is larger, more objects have time to die before the collection, which lowers the average cost of GC).
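To put rough numbers on that trade-off, here is a back-of-the-envelope Python sketch. It assumes a deliberately crude model that is not taken from any real runtime: every object stays live for exactly `lifetime` subsequent allocations, the young generation holds `capacity` objects, and cost is counted in objects copied. The function name `young_gen_costs` is hypothetical.

```python
def young_gen_costs(capacity, lifetime):
    """Estimate costs for one young-generation size under the toy model.

    worst_case_pause: objects copied if everything in the young generation is live.
    avg_copy_cost_per_alloc: objects copied per allocation in the steady state.
    """
    survivors = min(capacity, lifetime)  # only the most recent objects are still live
    return {
        "worst_case_pause": capacity,
        "avg_copy_cost_per_alloc": survivors / capacity,
    }


for capacity in (20, 100, 500):
    print(capacity, young_gen_costs(capacity, lifetime=10))
```

Under these assumptions, a larger young generation lowers the average cost per allocation but raises the worst-case pause, which is the tension described above.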

Running the GC manually is similar to making the young generation smaller. More young objects will still be live when the collection runs and so will be promoted to the old generation, which increases both the cost of collecting the young generation (more objects must be scavenged) and the cost of collecting the old generation (more old objects to handle).
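A small simulation can make this concrete. The following Python sketch is a toy model, not a measurement of any real collector: it assumes every object stays live for exactly `lifetime` allocations, that a young-generation collection promotes every survivor, and that a manual call happens every `forced_every` allocations. The `simulate` function and all of its parameters are hypothetical names chosen for illustration.

```python
def simulate(total_allocs, nursery_capacity, lifetime, forced_every=None):
    """Return (young_collections, promoted_objects) for a toy allocation trace."""
    nursery = []  # for each young object, the tick at which it becomes dead
    collections = 0
    promoted = 0
    for tick in range(1, total_allocs + 1):
        nursery.append(tick + lifetime)
        full = len(nursery) >= nursery_capacity
        forced = forced_every is not None and tick % forced_every == 0
        if full or forced:
            collections += 1
            # Objects that are still live are moved to the old generation.
            promoted += sum(1 for death in nursery if death > tick)
            nursery.clear()
    return collections, promoted


natural = simulate(total_allocs=1000, nursery_capacity=100, lifetime=10)
manual = simulate(total_allocs=1000, nursery_capacity=100, lifetime=10,
                  forced_every=20)
print("collect only when full:", natural)   # (10, 100)
print("manual call every 20:  ", manual)    # (50, 500)
```

In this model the manual calls cause five times as many collections and five times as many promotions for the same allocation trace, because each forced collection catches objects that would have died had the young generation been left to fill up.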

Thomas Pornin answered Dec 03 '22 06:12