My understanding is that GCs like ParallelGC and G1 are "generational" collectors. Garbage collection almost happens as a byproduct, since you move all live objects to a new heap region and anything left in the old region will simply be overwritten. This "byproduct" explanation makes a lot of sense, except for the part where Java needs to call finalize() on the dead objects. Does Java also keep a separate list of all objects in each heap region that it can compare against the live objects?
Yes, a GC keeps track of all these objects and their types.
As a matter of fact, GCs have a dedicated phase that deals just with these special references: WeakReference, SoftReference, PhantomReference and the artificial Finalizer references. Some call it the Cleanup phase, some Reference Processing; as part of it there are Pre-cleanup and Post-cleanup phases.
The idea is that when a GC encounters such a "special" reference during the mark phase, it keeps an eye on it. First it tracks these references separately (think: registers them in a special List). When the mark phase is done, it analyzes these references under a pause (stop-the-world), at least for some GCs. Some of them are not that complicated to work with. WeakReferences and SoftReferences are the easiest ones: if the referent is only weakly/softly reachable, the GC clears the reference so the referent can be reclaimed, and enqueues the reference on its ReferenceQueue if one was registered. PhantomReferences are almost the same (there is a difference between Java 8 and Java 9, but I will not go into details).
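The WeakReference case can be observed from plain Java code. The following is a minimal sketch, not a description of how any particular collector is implemented: the class and method names (WeakRefDemo, survivesWhileStronglyReachable, clearedAndEnqueued) are made up for illustration, and System.gc() is only a hint, so the second method is best effort and may report false on some JVMs or configurations.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

class WeakRefDemo {

    // While a strong reference to the referent exists, the weak reference is not cleared.
    static boolean survivesWhileStronglyReachable() {
        Object referent = new Object();
        WeakReference<Object> ref = new WeakReference<>(referent);
        System.gc(); // only a hint to the JVM
        // Using 'referent' here keeps it strongly reachable across the GC request.
        return ref.get() == referent;
    }

    // Best effort: true if the reference was cleared and enqueued before the timeout.
    static boolean clearedAndEnqueued(long timeoutMillis) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        // No strong reference to the referent is kept, so it is only weakly reachable.
        WeakReference<Object> ref = new WeakReference<>(new Object(), queue);
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            System.gc();
            // remove(50) waits up to 50 ms and returns null if nothing was enqueued.
            if (queue.remove(50) == ref) {
                return ref.get() == null; // a weak reference is cleared before it is enqueued
            }
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("strongly reachable, still there: " + survivesWhileStronglyReachable());
        System.out.println("weakly reachable, cleared and enqueued: " + clearedAndEnqueued(2000));
    }
}
```

The application never sees the dead referent itself, only the Reference object that shows up in the queue. That is why no "list of dead objects" is needed for this case.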
... where Java needs to call finalize() on the dead objects
You are sort of right here. The ugliest ones are Finalizers, mainly because a GC has to resurrect the dead object it found: finalize has to be called on an instance that is unreachable, or dead, so the GC can't reclaim it yet. So a GC first revives the object, only to kill it in the next cycle that works on this instance. Overall it does not have to be the second cycle, it could be the 100th; but it has to be the second one that involves this particular instance.
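The "revive first, reclaim in a later cycle" behaviour can be sketched with a toy model. This is a deliberately simplified simulation under stated assumptions, not HotSpot code: ToyFinalizationModel, Obj, collect and runFinalizers are hypothetical names, reachability is reduced to "is in the root set", and the finalizable flag stands in for "the class overrides finalize()".

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

final class ToyFinalizationModel {

    static final class Obj {
        final String name;
        final boolean finalizable; // stands in for "overrides finalize()"
        boolean finalized = false;
        Obj(String name, boolean finalizable) { this.name = name; this.finalizable = finalizable; }
    }

    private final Set<Obj> roots = new HashSet<>();         // reachable objects
    private final List<Obj> heap = new ArrayList<>();       // everything allocated
    private final List<Obj> registered = new ArrayList<>(); // finalizable, tracked since allocation
    private final List<Obj> pending = new ArrayList<>();    // revived, waiting for finalize

    Obj allocate(String name, boolean finalizable) {
        Obj o = new Obj(name, finalizable);
        heap.add(o);
        if (finalizable) registered.add(o);
        return o;
    }

    void addRoot(Obj o) { roots.add(o); }

    // One simulated GC cycle; returns the names of the objects reclaimed in it.
    List<String> collect() {
        // Unreachable objects that still need finalization are revived, not reclaimed.
        for (Iterator<Obj> it = registered.iterator(); it.hasNext();) {
            Obj o = it.next();
            if (!roots.contains(o)) { it.remove(); pending.add(o); }
        }
        // Every other unreachable object is reclaimed right away.
        List<String> reclaimed = new ArrayList<>();
        for (Iterator<Obj> it = heap.iterator(); it.hasNext();) {
            Obj o = it.next();
            if (!roots.contains(o) && !pending.contains(o)) { it.remove(); reclaimed.add(o.name); }
        }
        return reclaimed;
    }

    // Stands in for the finalizer thread draining its queue between GC cycles.
    void runFinalizers() {
        for (Obj o : pending) o.finalized = true;
        pending.clear();
    }

    public static void main(String[] args) {
        ToyFinalizationModel gc = new ToyFinalizationModel();
        gc.allocate("plain", false);
        gc.allocate("fin", true);
        System.out.println(gc.collect()); // prints [plain]
        gc.runFinalizers();
        System.out.println(gc.collect()); // prints [fin]
    }
}
```

In this model, if runFinalizers is not called between two cycles, "fin" simply survives another cycle. That matches the point above: it is the second cycle involving that instance that reclaims it, not necessarily the second cycle overall.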
Does Java also keep a separate list of all objects in each heap region that it can compare against the live objects?
Think about it for a moment: a list of all objects in a heap, where could you find something like this? The answer is quite simple and straightforward: the place where you can find all objects in the heap is the heap itself.
Garbage Collection almost happens as a byproduct, since you move all live objects to a new heap region and anything left in the old region will simply be overwritten. This "byproduct" explanation makes a lot of sense, except for the part where Java needs to call finalize() on the dead objects.
Why would that be a problem? As you've rightly pointed out, all live objects are processed (either moved to the next heap space, or aged). During garbage collection (both minor and major) you check the references for all objects in the processed heap space (you do not know which ones are live or dead before checking), which means that afterwards you know exactly which ones are live and which ones are dead. So what stops you from calling finalize() for the dead objects? You access them straight from the heap, so you can do that.
Also as a resource explaining Garbage collection in more detail, I still find Java Garbage Collection Basics to be quite nice, especially given its step by step example of generational garbage collection.