I saw some post about implement GC in C and some people said it's impossible to do it because C is weakly typed. I want to know how to implement GC in C++.
I want some general idea about how to do it. Thank you very much!
This is a Bloomberg interview question my friend told me. He did badly at that time. We want to know your ideas about this.
What is garbage collection in C#? The garbage collector (GC) manages the allocation and release of memory. The garbage collector serves as an automatic memory manager. You do not need to know how to allocate and release memory or manage the lifetime of the objects that use that memory.
C does not have automatic garbage collection. If you lose track of an object, you have what is known as a 'memory leak'. The memory will still be allocated to the program as a whole, but nothing will be able to use it if you've lost the last pointer to it. Memory resource management is a key requirement on C programs.
The mark-and-sweep algorithm is called a tracing garbage collector because it traces out the entire collection of objects that are directly or indirectly accessible by the program. Example: A. All the objects have their marked bits set to false.
There are two reasons why C / C++ doesn't have garbage collection. It is "culturally inappropriate". The culture of these languages is to leave storage management to the programmer. It would be technically difficult (and expensive) to implement a precise garbage collector for C / C++.
Garbage collection in C and C++ are both difficult topics for a few reasons:
Pointers can be typecast to integers and vice-versa. This means that I could have a block of memory that is reachable only by taking an integer, typecasting it to a pointer, then dereferencing it. A garbage collector has to be careful not to think a block is unreachable when indeed it still can be reached.
Pointers are not opaque. Many garbage collectors, like stop-and-copy collectors, like to move blocks of memory around or compact them to save space. Since you can explicitly look at pointer values in C and C++, this can be difficult to implement correctly. You would have to be sure that if someone was doing something tricky with typecasting to integers that you correctly updated the integer if you moved a block of memory around.
Memory management can be done explicitly. Any garbage collector will need to take into account that the user is able to explicitly free blocks of memory at any time.
In C++, there is a separation between allocation/deallocation and object construction/destruction. A block of memory can be allocated with sufficient space to hold an object without any object actually being constructed there. A good garbage collector would need to know, when it reclaims memory, whether or not to call the destructor for any objects that might be allocated there. This is especially true for the standard library containers, which often make use of std::allocator
to use this trick for efficiency reasons.
Memory can be allocated from different areas. C and C++ can get memory either from the built-in freestore (malloc/free or new/delete), or from the OS via mmap
or other system calls, and, in the case of C++, from get_temporary_buffer
or return_temporary_buffer
. The programs might also get memory from some third-party library. A good garbage collector needs to be able to track references to memory in these other pools and (possibly) would have to be responsible for cleaning them up.
Pointers can point into the middle of objects or arrays. In many garbage-collected languages like Java, object references always point to the start of the object. In C and C++ pointers can point into the middle of arrays, and in C++ into the middle of objects (if multiple inheritance is used). This can greatly complicate the logic for detecting what's still reachable.
So, in short, it's extremely hard to build a garbage collector for C or C++. Most libraries that do garbage collection in C and C++ are extremely conservative in their approach and are technically unsound - they assume that you won't, for example, take a pointer, cast it to an integer, write it to disk, and then load it back in at some later time. They also assume that any value in memory that's the size of a pointer could possibly be a pointer, and so sometimes refuse to free unreachable memory because there's a nonzero chance that there's a pointer to it.
As others have pointed out, the Boehm GC does do garbage collection for C and C++, but subject to the aforementioned restrictions.
Interestingly, C++11 includes some new library functions that allow the programmer to mark regions of memory as reachable and unreachable in anticipation of future garbage collection efforts. It may be possible in the future to build a really good C++11 garbage collector with this sort of information. In the meantime though, you'll need to be extremely careful not to break any of the above rules.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With