
Precise mode in Boehm Garbage Collector

I've read on the Mono website that they use the Boehm GC in precise mode. I also use the Boehm GC, with C++, but I have found nothing in its documentation or headers that indicates a precise mode, much less how to turn it on.

Does it actually have a precise mode by default, and if so, how do I turn it on? Or was this some kind of modification made by the Mono developers?

asked Jul 21 '11 10:07 by Frigo



2 Answers

Precise mode in the Boehm GC under Mono isn't just GC_MALLOC_ATOMIC; that is only the case for arrays of fundamental types.

For managed types, GC_gcj_malloc is used. Mono's compiler generates an object descriptor for every managed type, then simply calls GC_gcj_malloc with the object size and a pointer to the managed type's descriptor. During the mark phase, the Boehm GC consults that descriptor to trace the managed pointers inside the object.
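To make the descriptor idea concrete, here is a minimal, self-contained sketch of how a marker can use a per-type pointer bitmap to follow only the real pointer fields of an object. Node, TypeDescriptor, kNodeDescriptor and precise_children are hypothetical names made up for illustration. This is not Mono's or the Boehm GC's actual descriptor format, and it assumes every field occupies exactly one pointer-sized word.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical managed type: two reference fields and one plain integer.
struct Node {
    Node*          left;   // pointer: must be traced
    std::uintptr_t key;    // plain data: may look like an address, must be skipped
    Node*          right;  // pointer: must be traced
};

// Hypothetical descriptor: bit i set means word i of the object holds a pointer.
struct TypeDescriptor {
    std::size_t   size_in_words;
    std::uint64_t pointer_bitmap;
};

// Converts a field's byte offset into its bit in the bitmap.
constexpr std::uint64_t bit_for(std::size_t byte_offset) {
    return std::uint64_t{1} << (byte_offset / sizeof(void*));
}

// The kind of table a compiler (or a header-scanning tool) would generate.
const TypeDescriptor kNodeDescriptor = {
    sizeof(Node) / sizeof(void*),
    bit_for(offsetof(Node, left)) | bit_for(offsetof(Node, right)),
};

// Returns the non-null pointers a precise marker would follow from obj.
// Words whose bit is clear are never inspected, whatever value they hold.
std::vector<void*> precise_children(const void* obj, const TypeDescriptor& d) {
    std::vector<void*> out;
    const unsigned char* base = static_cast<const unsigned char*>(obj);
    for (std::size_t i = 0; i < d.size_in_words; ++i) {
        if (((d.pointer_bitmap >> i) & 1u) == 0) continue;
        void* p = nullptr;
        std::memcpy(&p, base + i * sizeof(void*), sizeof p);
        if (p != nullptr) out.push_back(p);
    }
    return out;
}
```

A conservative marker would have to treat key as a possible pointer whenever its value happens to fall inside the heap; with the bitmap, that word is skipped outright.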

That leaves only the root pointers, which sit on the stack as raw pointers (GC_gcj_malloc returns a void*, and there is no way to tell the GC where the pointers are on the stack, via some sort of stack descriptor, before a collection). This is why Mono (prior to SGen) says it scans the stack in conservative mode.

If you want to implement this in C++, you can't simply rely on the C++ compiler to generate the object descriptors for you. What I envisioned a long time ago was an intermediate compiler that parses all your C++ header files for class definitions marked as managed (e.g. _ref class MyManagedObject, where _ref is simply a #define to nothing) and generates a header file containing their object descriptors. You would then allocate your objects in precise mode with GC_make_descriptor and GC_malloc_explicitly_typed rather than GC_gcj_malloc, because you have no control over how your C++ compiler allocates its vtable.
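As a rough sketch of what such generated descriptors and the allocation call could look like: the pointer-word table below stands in for the generated header, and the allocation goes through the gc_typed.h interface (GC_BITMAP_SIZE, GC_set_bit, GC_WORD_LEN, GC_make_descriptor, GC_malloc_explicitly_typed). The names kWordLen, kPointerWords and allocate_managed, and the USE_BOEHM_GC build flag, are my own assumptions and not part of the collector. The header path may be <gc_typed.h> rather than <gc/gc_typed.h> depending on how the library is installed. Without the flag, the function falls back to calloc so the sketch still builds without libgc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

#ifdef USE_BOEHM_GC            // hypothetical build flag; also link with -lgc
#include <gc/gc.h>
#include <gc/gc_typed.h>
#endif

// A class the header-scanning tool would have recognised as managed.
struct MyManagedObject {
    MyManagedObject* next;     // word 0: pointer
    std::uintptr_t   payload;  // word 1: not a pointer
    char*            name;     // word 2: pointer
};

// What the generated header might contain: the object's length in words
// and the word indices of its pointer fields.
constexpr std::size_t kWordLen = sizeof(MyManagedObject) / sizeof(void*);
constexpr std::size_t kPointerWords[] = {
    offsetof(MyManagedObject, next) / sizeof(void*),
    offsetof(MyManagedObject, name) / sizeof(void*),
};

// Allocates one object whose pointer layout is known to the collector.
// With the real collector, call GC_INIT() once at startup before this.
MyManagedObject* allocate_managed() {
#ifdef USE_BOEHM_GC
    // Build the descriptor once from the generated table and reuse it.
    static GC_descr descr = [] {
        GC_word bitmap[GC_BITMAP_SIZE(MyManagedObject)] = {0};
        for (std::size_t w : kPointerWords) GC_set_bit(bitmap, w);
        return GC_make_descriptor(bitmap, GC_WORD_LEN(MyManagedObject));
    }();
    return static_cast<MyManagedObject*>(
        GC_malloc_explicitly_typed(sizeof(MyManagedObject), descr));
#else
    // Stand-in so the sketch builds without libgc; nothing collects this memory.
    return static_cast<MyManagedObject*>(
        std::calloc(1, sizeof(MyManagedObject)));
#endif
}
```

Note that this only makes the heap objects precise. Locals holding MyManagedObject* on the stack are still found conservatively, as described above.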

EDIT: See Managed C++ for GCC (open source, GPL v3).

answered Nov 10 '22 00:11 by Zach Saw


The file doc/gcinterface.html from the garbage collector (archive here) states:

void * GC_MALLOC_ATOMIC(size_t nbytes)
Allocates nbytes of storage. Requires (amortized) time proportional to nbytes. The resulting object will be automatically deallocated when unreferenced. The client promises that the resulting object will never contain any pointers. The memory is not cleared. This is the preferred way to allocate strings, floating point arrays, bitmaps, etc. More precise information about pointer locations can be communicated to the collector using the interface in gc_typed.h in the distribution.

So it looks like there is a "precise" interface, declared in gc_typed.h, that can be used.

answered Nov 10 '22 01:11 by gfour