
If the JVM keeps moving objects around when it does GC, how does it resolve references?

I'm reading up on JVM tuning, and it occurred to me that the JVM keeps moving objects around when it does GC. But Java objects have references to each other, which one would presume are implemented as pointers. The JVM can't possibly go over the whole heap every time it moves objects around and update all the references; surely that would take forever. So how does it resolve references, if the references do not change but the physical locations of the objects do?

I've read a lot about the JVM, but that was never explained, or even hinted at, anywhere.

[EDIT] My point is that references are one-way things. Going from the pointer to the pointed-to object is "instantaneous", but going the other way around would require a full heap scan. While that is possible, it seems unlikely. If 10K objects survive a minor collection, how long would it take to do a full heap scan 10K times to update the references to those objects? There must be some kind of optimized algorithm or structure used.

Asked by Sebastien Diot, Feb 27 '12 13:02



2 Answers

If you are really interested in how garbage collectors work, I recommend Richard Jones's two books on garbage collection. Links / references are here. Note that the books are not specifically about Java garbage collection.

(I have a copy of the older book, and the new one is on my shopping list.)


Here's a simple version of how a copying collector deals with this problem.

A copying collector works by copying objects from one space (the from-space) to another one (the to-space).

Specifically, the GC walks the graph of reachable objects within the "from" space, starting from each of the GC roots. Each time it finds a reference to an object (in an instance field, static field, stack frame, etc.), it checks the object that the reference points to, to see if it has been marked as visited.

  • If it is not yet marked, the GC does the following:

    1. It marks the object in the from-space.
    2. It copies the object into the to-space.
    3. It stores the address of the to-space copy in the from-space object. (This is like a forwarding address.)
    4. It recursively visits each reference field of the to-space copy of the object.

    The result of this is the reference to the to-space object.

  • If the object has been marked already, the GC looks up the forwarding address, and returns that.

The location (in to-space, or some GC root) where the GC got the reference from is then updated with the pointer to the object in to-space.
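The steps above can be sketched as a toy model in Java. This is only an illustration of the forwarding-address idea, not how HotSpot is implemented: the `Obj` class, its `fields` list, and the `toSpace` list are hypothetical stand-ins for real heap memory, and a non-null `forwardingAddress` doubles as the "marked" flag (steps 1 and 3 combined).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a recursive copying collector with forwarding addresses. */
final class CopyingGcSketch {

    /** A heap object: a label plus reference fields (hypothetical layout). */
    static final class Obj {
        final String label;
        final List<Obj> fields = new ArrayList<>(); // outgoing references
        Obj forwardingAddress;                      // non-null once the object was copied

        Obj(String label) { this.label = label; }
    }

    /** The to-space, modeled as the list of copies in the order they were made. */
    final List<Obj> toSpace = new ArrayList<>();

    /** Returns the to-space copy of fromObj, copying it first if needed. */
    Obj forward(Obj fromObj) {
        if (fromObj == null) return null;
        if (fromObj.forwardingAddress != null) {
            return fromObj.forwardingAddress;       // already moved: just look it up
        }
        Obj copy = new Obj(fromObj.label);
        copy.fields.addAll(fromObj.fields);         // these still point into from-space
        toSpace.add(copy);
        fromObj.forwardingAddress = copy;           // mark + leave forwarding address
        for (int i = 0; i < copy.fields.size(); i++) {
            // Update each reference slot of the copy as it is encountered.
            copy.fields.set(i, forward(copy.fields.get(i)));
        }
        return copy;
    }

    /** Every root slot is overwritten with the to-space reference. */
    void collect(Obj[] roots) {
        for (int i = 0; i < roots.length; i++) {
            roots[i] = forward(roots[i]);
        }
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c");
        Obj garbage = new Obj("garbage");
        a.fields.add(b);
        a.fields.add(c);
        b.fields.add(c);
        c.fields.add(a);        // cycle back to a
        garbage.fields.add(a);  // dead object pointing at a live one

        Obj[] roots = { a };
        CopyingGcSketch gc = new CopyingGcSketch();
        gc.collect(roots);

        System.out.println("survivors copied: " + gc.toSpace.size());
        System.out.println("root now points at the copy: " + (roots[0] != a));
    }
}
```

Note that nothing ever searches for "who points at `c`": both references to `c` are fixed when the traversal reaches the slots that hold them, and the unreachable `garbage` object is never visited at all. The recursion is kept only to mirror the description; a production collector would presumably avoid deep recursion.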

If you follow all of that, then you will see that the GC doesn't need to go looking for all of the places that hold a reference to a given moved object. Instead, it simply encounters all of the places in the traversal of the reachable objects. Of course, the GC does have to do that traversal, but there are various techniques to reduce the amount of traversing that needs to be done in each GC cycle.

If you haven't followed the above, then PLEASE go read one of the textbooks that I've recommended. They'll do a much better job of explaining it than I can do. You'll also find material on how other kinds of GC deal with this issue.


The Java HotSpot GCs are all copying collectors of one form or another. Things get a bit more complicated than my description above for parallel and concurrent collecting, but the "forwarding address" mechanism is common to all of them.

(There are not many published papers or other public documentation on HotSpot GCs, and most of the material that exists assumes that the reader has a good understanding of how modern garbage collectors work.)

Answered by Stephen C, Nov 05 '22 07:11


"the JVM can't possibly go over the whole heap after every time it moved objects around, and update all the references"

I'm no expert on GC myself, but as far as I know, that is more or less what it does. See e.g. this text:

In contrast, a copying collector copies reachable objects to another region of memory as they are being traversed. [...] After such a traversal all surviving objects reside in a contiguous region of memory, and all pointers have been updated to point to the new object locations. [...] During the process, the GC builds an object graph to track the "live" objects so that it can update references to any objects that it moves.

(http://wiki.osdev.org/Garbage_collection#Copy_collectors, emphasis mine).

As to this "taking forever": the main idea behind a copying (or moving) garbage collector is that only a small number of objects will actually need to be moved, because most instances are already dead (i.e. most instances are very short-lived). So the number of objects that move is small, and hopefully the number of references pointing to them is also fairly small.

At any rate, the GC must build a list of object references anyway (to find out which objects are still referenced/alive and need to be copied), so it can probably reuse that list to update the references. So only the updating is "extra work".
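To make the cost argument concrete, here is a small hedged experiment in Java. It is a toy model, not a measurement of any real JVM: `Node`, `run`, and the counters are hypothetical names, and the loop uses an iterative Cheney-style scan (the to-space doubles as the work queue) rather than anything the quoted text prescribes. It counts how many objects are copied and how many reference slots are visited when the same ten live objects sit in heaps of very different sizes.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy experiment: copying cost follows the survivors, not the heap size. */
final class SurvivorCostSketch {

    static final class Node {
        Node next;        // single reference field
        Node forwarding;  // set once the node has been copied
    }

    int copied;        // objects copied into to-space
    int slotsVisited;  // reference slots examined and rewritten (roots + fields)

    private final List<Node> toSpace = new ArrayList<>();

    private Node forward(Node from) {
        slotsVisited++;                   // the caller stores our result in one slot
        if (from == null) return null;
        if (from.forwarding == null) {
            Node copy = new Node();
            copy.next = from.next;        // still a from-space pointer for now
            from.forwarding = copy;
            toSpace.add(copy);
            copied++;
        }
        return from.forwarding;
    }

    /** Iterative scan: each copied node has its field fixed exactly once. */
    Node collect(Node root) {
        Node newRoot = forward(root);
        for (int scan = 0; scan < toSpace.size(); scan++) {
            Node n = toSpace.get(scan);
            n.next = forward(n.next);
        }
        return newRoot;
    }

    /** Allocates heapSize nodes and keeps a chain of `live` of them reachable. */
    static SurvivorCostSketch run(int heapSize, int live) {
        List<Node> fromSpace = new ArrayList<>();
        for (int i = 0; i < heapSize; i++) fromSpace.add(new Node());
        for (int i = 0; i + 1 < live; i++) {
            fromSpace.get(i).next = fromSpace.get(i + 1);
        }
        // Everything from index `live` on is garbage; it may still point at live nodes.
        for (int i = live; i < heapSize && live > 0; i++) {
            fromSpace.get(i).next = fromSpace.get(0);
        }
        SurvivorCostSketch gc = new SurvivorCostSketch();
        gc.collect(live > 0 ? fromSpace.get(0) : null);
        return gc;
    }

    public static void main(String[] args) {
        for (int heapSize : new int[] { 1_000, 200_000 }) {
            SurvivorCostSketch gc = run(heapSize, 10);
            System.out.println("heap=" + heapSize
                    + " copied=" + gc.copied
                    + " slotsVisited=" + gc.slotsVisited);
        }
    }
}
```

In this model the counters are the same for both heap sizes, because dead objects (and the references they hold) are never touched. The reference updates happen inside the same traversal that discovers the live objects, which is the sense in which only the updating is "extra work".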

Answered by sleske, Nov 05 '22 08:11