Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Alternative for Garbage Collector

I'd like to know the best alternative for a garbage collector, with its pros and cons. My priority is speed, memory is less important. If there is garbage collector which doesn't make any pause, let me know.

I'm working on a safe language (i.e. a language with no dangling pointers, checking bounds, etc), and garbage collection or its alternative has to be used.

like image 653
chris Avatar asked Apr 08 '10 06:04

chris


2 Answers

I suspect you will be best sticking with garbage collection (as per the JVM) unless you have a very good reason otherwise. Modern GCs are extremely fast, general purpose and safe. Unless you can design your language to take advantage of a very specific special case (as in one of the above allocators) then you are unlikely to beat the JVM.

The only really compelling reason I see nowadays as an argument against modern GC is latency issues caused by GC pauses. These are small, rare and not really an issue for most purposes (e.g. I've successfully written 3D engines in Java), but they still can cause problems in very tight realtime situations.

Having said that, there may still be some special cases where a different memory allocation scheme may make sense so I've listed a few interesting options below:

An example of a very fast, specialised memory management approach is the "per frame" allocator used in many games. This works by incrementing a single pointer to allocate memory, and at the end of a time period (typically a visual "frame") all objects are discarded at once by simply setting the pointer back to the base address and overwriting them in the next allocation. This can be "safe", however the constraints of object lifetime would be very strict. Might be a winner if you can guarantee that all memory allocation is bounded in size and only valid for the scope of handling e.g. a single server request.

Another very fast approach is to have dedicated object pools for different classes of object. Released objects can just be recycled in the pool, using something like a linked list of free object slots. Operating systems often used this kind of approach for common data structures. Again however you need to watch object lifetime and explicitly handle disposals by returning objects to the pool.

Reference counting looks superficially good but usually doesn't make sense because you frequently have to dereference and update the count on two objects whenever you change a pointer value. This cost is usually worse than the advantage of having simple and fast memory management, and it also doesn't work in the presence of cyclic references.

Stack allocation is extremely fast and can run safely. Depending on your language, it is possible to make do without a heap and run entirely on a stack based system. However I suspect this will somewhat constrain your language design so that might be a non-starter. Still might be worth considering for certain DSLs.

Classic malloc/free is pretty fast and can be made safe if you have sufficient constraints on object creation and lifetime which you may be able to enforce in your language. An example would be if e.g. you placed significant constraints on the use of pointers.

Anyway - hope this is useful food for thought!

like image 176
mikera Avatar answered Sep 16 '22 16:09

mikera


If speed matters but memory does not, then the fastest and simplest allocation strategy is to never free. Allocation is simply a matter of bumping a pointer up. You cannot get faster than that.

Of course, never releasing anything has a huge potential for overflowing available memory. It is very rare that memory is truly "unimportant". Usually there is a large but finite amount of available memory. One strategy is called "region based allocation". Namely you allocate memory in a few big blocks called "regions", with the pointer-bumping strategy. Release occurs only by whole regions. This strategy can be applied with some success if the problem at hand can be structured into successive "tasks", each having its own region.

For more generic solutions, if you want real-time allocation (i.e. guaranteed limits on the response time from allocation requests) then garbage collection is the way to go. A real-time GC may look like this: objects are allocated with a pointer-bumping strategy. Also, on every allocation, the allocator performs a little bit of garbage collection, in which "live" objects are copied somewhere else. In a way the GC runs "at the same time" than the application. This implies a bit of extra work for accessing objects, because you cannot move an object and update all pointers to point to the new object location while keeping the "real-time" promise. Solutions may imply barriers, e.g. an extra indirection. Generational GC allow for barrier-free access to most objects while keeping pause times under strict bounds.

This article is a must-read for whoever wants to study memory allocation, in particular garbage collection.

like image 44
Thomas Pornin Avatar answered Sep 20 '22 16:09

Thomas Pornin