Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

To GC or Not To GC

I've recently seen two really nice and educating languages talks:

This first one by Herb Sutter, presents all the nice and cool features of C++0x, why C++'s future seems brighter than ever, and how M$ is said to be a good guy in this game. The talk revolves around efficiency and how minimizing heap activity very often improves performance.

This other one, by Andrei Alexandrescu, motivates a transition from C/C++ to his new game-changer D. Most of D's stuff seems really well motivated and designed. One thing, however, surprised me, namely that D pushes for garbage collection and that all classes are created solely by reference. Even more confusing, the book The D Programming Language Ref Manual specifically in the section about Resource Management states the following, quote:

Garbage collection eliminates the tedious, error prone memory allocation tracking code necessary in C and C++. This not only means much faster development time and lower maintenance costs, but the resulting program frequently runs faster!

This conflicts with Sutter's constant talk about minimizing heap activity. I strongly respect both Sutter's and Alexandrescou's insights, so I feel a bit confused about these two key questions

  1. Doesn't creating class instances solely by reference result in a lot of unnecesseary heap activity?

  2. In which cases can we use Garbage Collection without sacrificing run-time performance?

like image 336
Nordlöw Avatar asked Sep 27 '11 20:09

Nordlöw


People also ask

Does garbage collection affect performance?

The most common performance problem associated with Java™ relates to the garbage collection mechanism. If the size of the Java heap is too large, the heap must reside outside main memory. This causes increased paging activity, which affects Java performance.

What does GC copy mean?

Garbage collection (GC) is a dynamic approach to automatic memory management and heap allocation that processes and identifies dead memory blocks and reallocates storage for reuse. The primary purpose of garbage collection is to reduce memory leaks.

How can I improve my GC performance?

Improving GC Performance There are two major ways to do this. First, by adjusting the heap sizes of young and old generations, and second, to reduce the rate of object allocation and promotion. In terms of adjusting heap sizes, it's not as straightforward as one might expect.

What triggers garbage collection?

When a JVM runs out of space in the storage heap and is unable to allocate any more objects (an allocation failure), a garbage collection is triggered. The Garbage Collector cleans up objects in the storage heap that are no longer being referenced by applications and frees some of the space.


2 Answers

To directly answer your two questions:

  1. Yes, creating class instances by reference does result in a lot of heap activity, but:

    a. In D, you have struct as well as class. A struct has value semantics and can do everything a class can, except polymorphism.

    b. Polymorphism and value semantics have never worked well together due to the slicing problem.

    c. In D, if you really need to allocate a class instance on the stack in some performance-critical code and don't care about the loss of safety, you can do so without unreasonable hassle via the scoped function.

  2. GC can be comparable to or faster than manual memory management if:

    a. You still allocate on the stack where possible (as you typically do in D) instead of relying on the heap for everything (as you often do in other GC'd languages).

    b. You have a top-of-the-line garbage collector (D's current GC implementation is admittedly somewhat naive, though it has seen some major optimizations in the past few releases, so it's not as bad as it was).

    c. You're allocating mostly small objects. If you allocate mostly large arrays and performance ends up being a problem, you may want to switch a few of these to the C heap (you have access to C's malloc and free in D) or, if it has a scoped lifetime, some other allocator like RegionAllocator. (RegionAllocator is currently being discussed and refined for eventual inclusion in D's standard library).

    d. You don't care that much about space efficiency. If you make the GC run too frequently to keep the memory footprint ultra-low, performance will suffer.

like image 200
dsimcha Avatar answered Oct 05 '22 18:10

dsimcha


The reason creating an object on the heap is slower than creating it on the stack is that the memory allocation methods need to deal with things like heap fragmentation. Allocating memory on the stack is as simple as incrementing the stack pointer (a constant-time operation).

Yet, with a compacting garbage collector, you don't have to worry about heap fragmentation, heap allocations can be as fast as stack allocations. The Garbage Collection page for the D Programming Language explains this in more detail.

The assertion that GC'd languages run faster is probably assuming that many programs allocate memory on the heap much more often than on the stack. Assuming that heap allocation could be faster in a GC'd language, then it follows that you have just optimized a huge part of most programs (heap allocation).

like image 23
Jack Edmonds Avatar answered Oct 05 '22 17:10

Jack Edmonds