Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

can a GC be implemented with C++ raw pointers?

I was wondering how a Garbage Collector can be implemented with C++ full power of pointers arithmetic. Also, In languages like Java, I can not assign literal addresses to references. In C++ it is very flexible.

I believe that C# has both, but again, the unsafe pointer in C# is the responsibility of the programmer.

EITD :: guys, I am asking if C++ pointers 'as they are currently' can be GCed in theory or not ?

like image 554
Khaled Alshaya Avatar asked Jun 28 '09 01:06

Khaled Alshaya


2 Answers

Pointer arithmetic isn't the fundamental problem. GC's have to deal with pointers being reassigned all the time, and pointer arithmetic is just another example of that. (Of course, if pointer arithmetic between pointers pointing to different buffers was allowed, it would cause problems, but it isn't. The only arithmetics you're allowed to perform on a pointer pointing into an array A are the ones that repositions it within that array.

The real problem is the lack of metadata. A GC has to know what is a pointer and what isn't.

If it encounters the value 0x27a2c230, it has to be able to determine if it is

  • a pointer (in which case it has to follow the pointer to mark the destination as "in use" recursively)
  • An integer (The same value is a perfectly valid integer. Perhaps it's not a pointer at all)
  • or something else, say, a bit of a string.

It also has to be able to determine the extent of a struct. Assuming that value is a pointer, and it points into another struct, the GC has to be able to determine the size and extent of that struct, so it knows which range of addresses should be scanned for more pointers.

GC'ed languages have a lot of infrastructure to deal with this. C++ doesn't.

Boehm's GC is the closest you can generally get, and it it is conservative in that if something might be a pointer, the GC assumes it to be one, which means some data is needlessly kept alive. And so, it will likely keep data alive which should be GC'ed.

Alternatively, of course all this infrastructure could in principle be added to a C++ compiler. There's no rule in the standard that it's not allowed to exist. The problem is that it would be a major performance hit and eliminate a lot of optimization opportunities.

like image 115
jalf Avatar answered Sep 21 '22 11:09

jalf


Yes. There was a system at NeXT back in the 1990s that worked like this: they keep track of every C/C++ memory block that has been allocated. Periodically they scan memory to see if the 4-byte (or 8-byte) value associated with that memory address is present. If it isn't, they assume there are no references, and they free the memory. It's neat and easy! But sometimes it screws up.

Here are some other approaches: http://www.hpl.hp.com/personal/Hans_Boehm/gc/

http://developers.sun.com/solaris/articles/libgc.html

like image 35
vy32 Avatar answered Sep 21 '22 11:09

vy32