I have heard that it was suboptimal for C to automatically collect garbage — is there any truth to this?
Was there a specific reason garbage collection was not implemented for C?
There are two reasons why C and C++ don't have garbage collection. First, it is "culturally inappropriate": the culture of these languages is to leave storage management to the programmer. Second, it would be technically difficult (and expensive) to implement a precise garbage collector for C or C++.
C does not have automatic garbage collection. If you lose track of an object, you have what is known as a "memory leak". The memory is still allocated to the program as a whole, but nothing can use it once you've lost the last pointer to it. Managing memory correctly is therefore a key requirement of C programs.
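To make "losing the last pointer" concrete, here is a minimal sketch. The names copy_string, leaky_length and careful_length are made up for illustration and are not from any library. The first function overwrites its only pointer to a heap block, so that block can never be freed; the second frees it before reusing the variable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of s, or NULL if malloc fails. */
char *copy_string(const char *s) {
    assert(s != NULL);
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL) {
        memcpy(p, s, n);
    }
    return p;
}

/* Leaks: the copy of "first" becomes unreachable when p is overwritten. */
size_t leaky_length(void) {
    char *p = copy_string("first");
    p = copy_string("second");      /* last pointer to "first" is lost here */
    size_t len = (p != NULL) ? strlen(p) : 0;
    free(p);                        /* frees only the second block */
    return len;
}

/* Same work, but every block is freed before its pointer is reused. */
size_t careful_length(void) {
    char *p = copy_string("first");
    free(p);                        /* free(NULL) is a no-op, so this is safe */
    p = copy_string("second");
    size_t len = (p != NULL) ? strlen(p) : 0;
    free(p);
    return len;
}
```

Both functions return the same value, which is the point: the compiler and the language will not tell you about the leak. A tool such as Valgrind or a leak sanitizer would flag leaky_length, while careful_length comes up clean.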
Excessive garbage collection activity can occur due to a memory leak in the Java application. Insufficient memory allocation to the JVM can also result in increased garbage collection activity. And when excessive garbage collection activity happens, it often manifests as increased CPU usage of the JVM!
This is one of many reasons why languages like Java and C# are slower than C and C++ by design. And it is also the reason why C and C++ don't have and never will have a garbage collector, since those languages prioritize execution speed.
Don't listen to the "C is old and that's why it doesn't have GC" folks. There are fundamental problems with GC that cannot be overcome which make it incompatible with C.
The biggest problem is that accurate garbage collection requires the ability to scan memory and identify any pointers encountered. Some higher level languages limit integers not to use all the bits available, so that high bits can be used to distinguish object references from integers. Such languages may then store strings (which could contain arbitrary octet sequences) in a special string zone where they can't be confused with pointers, and all is well. A C implementation, however, cannot do this because bytes, larger integers, pointers, and everything else can be stored together in structures, unions, or as part of chunks returned by malloc.
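A small sketch of that ambiguity follows. The union slot type and the function names are hypothetical, and the code assumes a mainstream platform where uintptr_t has the same size as void * and a pointer-to-integer cast keeps the same bits (the C standard does not promise either). One slot holds a real pointer, the other an integer with the same value, and nothing in memory tells them apart.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical slot that a program may use for an integer or a pointer. */
union slot {
    uintptr_t as_integer;
    void *as_pointer;
};

/* Returns 1 if the two slots contain exactly the same bytes. */
int same_bits(const union slot *a, const union slot *b) {
    return memcmp(a, b, sizeof *a) == 0;
}

/* Returns 1 if a pointer and a plain integer are indistinguishable in
   memory, or -1 if the allocation failed. */
int demo_ambiguity(void) {
    /* Assumption: pointer and integer representations have equal size. */
    assert(sizeof(uintptr_t) == sizeof(void *));

    void *block = malloc(16);
    if (block == NULL) {
        return -1;
    }

    union slot holds_pointer;
    union slot holds_integer;
    memset(&holds_pointer, 0, sizeof holds_pointer);
    memset(&holds_integer, 0, sizeof holds_integer);

    holds_pointer.as_pointer = block;             /* a genuine reference  */
    holds_integer.as_integer = (uintptr_t)block;  /* just a number        */

    int identical = same_bits(&holds_pointer, &holds_integer);
    free(block);
    return identical;
}
```

A collector scanning these two slots sees identical bytes. Without type information that C does not keep at run time, it cannot know that only one of them is a reference.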
What if you throw away the accuracy requirement and decide you're okay with a few objects never getting freed because some non-pointer data in the program has the same bit pattern as these objects' addresses? Now suppose your program receives data from the outside world (network, files, etc.). I claim I can make your program leak an arbitrary amount of memory, and eventually run out of memory, as long as I can guess enough pointers and emulate them in the strings I feed your program. This gets a lot easier if you apply de Bruijn sequences.
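Here is a toy sketch of how that false retention happens. It is not how any particular collector is implemented: looks_referenced and demo_false_retention are invented names, a static array stands in for a heap block, and a local buffer stands in for data read from the network. The scan treats every aligned word as a possible pointer, the way a conservative collector has to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy conservative scan: returns 1 if any aligned word in buf has a value
   inside [start, start + size), i.e. "looks like" a pointer into the block. */
int looks_referenced(const void *buf, size_t len,
                     const void *start, size_t size) {
    assert(buf != NULL && start != NULL);
    uintptr_t lo = (uintptr_t)start;
    uintptr_t hi = lo + size;
    const unsigned char *bytes = buf;

    for (size_t off = 0; off + sizeof(uintptr_t) <= len;
         off += sizeof(uintptr_t)) {
        uintptr_t word;
        memcpy(&word, bytes + off, sizeof word);  /* avoids aliasing issues */
        if (word >= lo && word < hi) {
            return 1;
        }
    }
    return 0;
}

/* Returns 1 if forged input bytes make an unreferenced block look live,
   or -1 if the block already looked live before forging. */
int demo_false_retention(void) {
    static char object[32];       /* stands in for a dead heap block */
    unsigned char input[64];      /* stands in for attacker-supplied data */
    memset(input, 0, sizeof input);

    if (looks_referenced(input, sizeof input, object, sizeof object)) {
        return -1;
    }

    /* The "attacker" guesses an address inside the block and sends it. */
    uintptr_t forged = (uintptr_t)object + 8;
    memcpy(input + 16, &forged, sizeof forged);

    return looks_referenced(input, sizeof input, object, sizeof object);
}
```

The input buffer never held a real reference, yet after the forged bytes arrive the scan has no choice but to keep the block alive.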
Aside from that, garbage collection is just plain slow. You can find hundreds of academics who like to claim otherwise, but that won't change the reality. The performance issues of GC can be broken down into 3 main categories:
The people who will claim GC is fast these days are simply comparing it to the wrong thing: poorly written C and C++ programs which allocate and free thousands or millions of objects per second. Yes, these will also be slow, but at least predictably slow in a way you can measure and fix if necessary. A well-written C program will spend so little time in malloc/free that the overhead is not even measurable.
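One common way to get there is to allocate once and reuse the buffer instead of allocating per item. The sketch below is only an illustration with made-up names (counted_malloc, sum_lengths_per_item, sum_lengths_reused, allocations_used). It counts calls to malloc rather than timing anything, so it shows the pattern and makes no performance claim of its own.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static unsigned long alloc_calls = 0;

/* Hypothetical wrapper that counts how often malloc is called. */
void *counted_malloc(size_t n) {
    alloc_calls++;
    return malloc(n);
}

/* Allocation-heavy style: one malloc/free pair per item. */
unsigned long sum_lengths_per_item(const char *const *items, size_t count) {
    assert(items != NULL);
    unsigned long total = 0;
    for (size_t i = 0; i < count; i++) {
        size_t n = strlen(items[i]) + 1;
        char *copy = counted_malloc(n);
        if (copy == NULL) {
            continue;               /* skip the item if allocation fails */
        }
        memcpy(copy, items[i], n);
        total += strlen(copy);
        free(copy);
    }
    return total;
}

/* Reuse style: one buffer for the whole loop; items longer than max_len
   are skipped rather than overflowing the buffer. */
unsigned long sum_lengths_reused(const char *const *items, size_t count,
                                 size_t max_len) {
    assert(items != NULL);
    unsigned long total = 0;
    char *buf = counted_malloc(max_len + 1);
    if (buf == NULL) {
        return 0;
    }
    for (size_t i = 0; i < count; i++) {
        size_t n = strlen(items[i]);
        if (n > max_len) {
            continue;
        }
        memcpy(buf, items[i], n + 1);
        total += strlen(buf);
    }
    free(buf);
    return total;
}

/* Runs one of the two styles on four sample strings and reports how many
   times malloc was called. */
unsigned long allocations_used(int reuse) {
    static const char *const items[] = {"alpha", "beta", "gamma", "delta"};
    alloc_calls = 0;
    if (reuse) {
        sum_lengths_reused(items, 4, 16);
    } else {
        sum_lengths_per_item(items, 4);
    }
    return alloc_calls;
}
```

With four items the per-item version calls malloc four times and the reuse version calls it once. The gap grows linearly with the number of items, which is why the allocation pattern matters more than the allocator.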