Some newer languages build ARC into their compilers (Swift is the best-known example, and Rust offers opt-in reference counting through its `Rc` and `Arc` library types). As I understand it, this achieves the same thing as runtime GC (taking the burden of manual deallocation away from the programmer), while being significantly more efficient.
I understand that ARC could become a complex process, but with the complexity of modern garbage collectors it seems like it would be no more complex to implement ARC. However, there are still tons of languages and frameworks using GC for memory management, and even the Go language, which targets systems programming, uses GC.
I really cannot understand why GC would be preferable to ARC. Am I missing something here?
There are a bunch of tradeoffs involved here; it's a complex topic. Here are the big ones, though:
GC pros:
- Tracing garbage collectors can handle cycles in object graphs. Automatic reference counting will leak memory unless cycles are manually broken, either by removing a reference or by figuring out which edge of the graph should be weak. This is quite a common problem in practice in reference-counted apps.
- Tracing garbage collectors can actually be moderately faster (in terms of throughput) than reference counting, by doing work concurrently, by batching work up, by deferring work, and by not messing up caches touching reference counts in hot loops.
- Copying collectors can compact the heap, reclaiming fragmented pages to reduce footprint.
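To make the cycle point concrete, here is a minimal sketch. It uses Rust's library type `Rc` as a stand-in for compiler-inserted ARC (the counting behaves the same way for this purpose, but that substitution is my assumption, not something specific to Swift). `Node`, `new_node`, and `destroyed_after_scope` are hypothetical names made up for illustration.

```rust
use std::cell::{Cell, RefCell};
use std::rc::{Rc, Weak};

// Hypothetical node type for illustration; `drops` counts destructor runs.
struct Node {
    drops: Rc<Cell<u32>>,
    strong_peer: RefCell<Option<Rc<Node>>>,
    weak_peer: RefCell<Weak<Node>>,
}

impl Drop for Node {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

fn new_node(drops: &Rc<Cell<u32>>) -> Rc<Node> {
    Rc::new(Node {
        drops: Rc::clone(drops),
        strong_peer: RefCell::new(None),
        weak_peer: RefCell::new(Weak::new()),
    })
}

/// Builds a two-node cycle and returns how many nodes were destroyed
/// once the local handles went out of scope.
fn destroyed_after_scope(break_cycle_with_weak: bool) -> u32 {
    let drops = Rc::new(Cell::new(0));
    {
        let a = new_node(&drops);
        let b = new_node(&drops);
        // a -> b is always a strong (owning) edge.
        *a.strong_peer.borrow_mut() = Some(Rc::clone(&b));
        if break_cycle_with_weak {
            // b -> a is weak: it does not keep `a` alive.
            *b.weak_peer.borrow_mut() = Rc::downgrade(&a);
        } else {
            // b -> a is strong too: neither count can ever reach zero.
            *b.strong_peer.borrow_mut() = Some(Rc::clone(&a));
        }
    } // the local handles `a` and `b` are dropped here
    drops.get()
}

fn main() {
    println!("strong cycle: {} destroyed", destroyed_after_scope(false));
    println!("weak back edge: {} destroyed", destroyed_after_scope(true));
}
```

With two strong edges, neither destructor runs and both nodes leak. Once the back edge is weak, both nodes are destroyed as soon as the last outside handle goes away. A tracing collector would reclaim both versions without the programmer having to decide which edge is the weak one.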
ARC pros:
- Because object destruction happens immediately when the reference count hits 0, object lifetimes can be used to manage non-memory resources. With garbage collection, lifetimes are non-deterministic, so this isn't safe.
- Collection work is typically more spread out, resulting in much shorter pauses (it's still possible to get a pause if you deallocate a large subgraph of objects).
- Because memory is collected synchronously, it's not possible to "outrun the collector" by allocating faster than it can clean up. This is particularly important when VM paging comes into play, since there are degenerate cases where the GC thread hits a page that's been paged out, and falls far behind.
- On a related note, tracing garbage collectors have to walk the entire object graph, which forces unnecessary page-ins (there are mitigations for this, like https://people.cs.umass.edu/~emery/pubs/f034-hertz.pdf, but they're not widely deployed).
- Tracing garbage collectors typically need more "scratch space" than reference counting if they want to hit their full throughput.
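The first ARC point, using object lifetimes to manage non-memory resources, can be sketched like this. Again I'm using Rust's `Rc` as a stand-in for ARC. `Connection` and `close_order` are hypothetical names, and the "resource" is just a log entry rather than a real file or socket.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical stand-in for a non-memory resource (file, socket, lock).
struct Connection {
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Connection {
    fn drop(&mut self) {
        // Runs at the exact point the last owner goes away.
        self.log.borrow_mut().push("closed");
    }
}

/// Records the order of events around the release of a shared resource.
fn close_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let first = Rc::new(Connection { log: Rc::clone(&log) });
    let second = Rc::clone(&first); // reference count: 2

    log.borrow_mut().push("opened");
    drop(first); // reference count: 1, resource still open
    log.borrow_mut().push("one owner left");
    drop(second); // reference count: 0, destructor runs right here
    log.borrow_mut().push("after last owner");

    let events = log.borrow().clone();
    events
}

fn main() {
    println!("{:?}", close_order());
}
```

The "closed" entry always lands between "one owner left" and "after last owner", because the destructor runs synchronously when the count hits zero. With a tracing collector, a finalizer would run at some unspecified later time (or possibly never), which is why GC'd languages tend to add explicit constructs for closing resources instead.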
My personal take on this is that the only two points that really matter for most cases are:
- ARC doesn't collect cycles
- GC doesn't have deterministic lifetimes
I feel that both of these issues are deal breakers, but in the absence of a better idea, you just have to pick which horrifying problem sounds worse to you.