Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there a garbage collection algorithm that meets these requirements?

I am writing a compiler for a statically-typed object-oriented language. Currently I'm researching garbage collection algorithms to use. I'm wondering if there's a collector that is:

  • Open source and documented, so that I can implement it.
  • Acurrate
  • Generational
  • Global, i.e there is only one collector per process, as opposed to say one per thread.
  • Incremental and/or concurrent, to avoid long pauses from major collections.
  • Fits with this programming paradigm. An example of what doesn't would be a collector which becomes very slow in presence of destructive assignment.

Edit: To clarify, I was wondering if there's an implementable algorithm that does this, not if there's an off-the-shelf collector.

like image 301
keiter Avatar asked Feb 01 '11 13:02

keiter


People also ask

What algorithm is used for garbage collection?

The mark-and-sweep algorithm is called a tracing garbage collector because it traces out the entire collection of objects that are directly or indirectly accessible by the program.

Do we have any control over garbage collection algorithm?

We cannot force the garbage collector to collect the garbage, it depends on the JVM. If the Heap Memory is full, the JVM will not allow to create a new object and shows an error java.

Is DFS algorithm is used for garbage collection?

The garbage collector will start traversal from all references(on stack, registers, static variables) in the program and visit all the inner references in a Depth First Search (DFS) manner and mark objects as reachable. Every allocated object on the heap has a flag let's call marked set to false when it is allocated.

What are the three possible conditions for a garbage collection?

Conditions for a garbage collectionThe system has low physical memory. The memory size is detected by either the low memory notification from the operating system or low memory as indicated by the host. The memory that's used by allocated objects on the managed heap surpasses an acceptable threshold.


1 Answers

There's one not-at-all-experimental garbage collection algorithm that actually meets all your requirements: simple automatic refcounting. On the whole, refcounting really doesn't get enough credit as a viable option, but actually it works really nicely in many situations, there are never any big batch delays, and there's no need for complicated magic.

One concern is still cleaning up circular references, which you can at least leave to be done extremely rarely; app developers who care about speed can just explicitly break the loops when they need the objects to go away.

A little-appreciated feature of refcounting is that it's much more dcache-friendly than other forms of garbage collection. If you're running a loop that allocates some small temporary objects every time through the loop, a refcounting GC (or explicit memory management, of course) can reuse the same memory each time, avoiding unnecessary cache flushes. Any other kind of GC would only free up the objects periodically, resulting in a much bigger memory footprint and therefore slowness.

Refcounting is not very efficient for heavily multi-threaded systems, because you need to acquire locks every time you touch the refcount. But if you're designing a new language anyhow, there's one huge thing you can do to improve performance and reliability all over your language: prevent almost all objects from being shared between threads. ie. make sharing explicit. If you do that, you will know which objects are vs. aren't shared, and therefore which ones need to be locked when incrementing/decrementing the refcount and which can be left unlocked. When there isn't any locking, refcounting performance can be really excellent.

like image 128
apenwarr Avatar answered Sep 19 '22 18:09

apenwarr