Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does a non-static field not act as a GC root?

As I know static fields (along with Threads, local variables and method arguments, JNI references) act as GC roots.

I cannot provide a link that would confirm this, but I have read a lot of articles on it.

Why can't a non-static field act as a GC root?

like image 696
gstackoverflow Avatar asked Dec 28 '16 16:12

gstackoverflow


People also ask

What are GC roots?

GC root is a term used in the context of garbage collection in Java. They are special objects for the garbage collector. As the name suggests, GC roots are starting points for the garbage collector processes. In general, all objects directly or indirectly referenced from a GC root are not garbage collected.

What is a non static field?

Non-static fields are instance fields of an object. They can only be accessed or invoked through an object reference. The value of static variable remains constant throughout the class. The value of Non-static variables changes as the objects has their own copy of these variables.

What is GC root chain?

A local variable in a method that is currently running is considered to be a GC root. The objects referenced by these variables can always be accessed immediately by the method they are declared in, and so they must be kept around. The lifetime of these roots can depend on the way the program was built.

What is the purpose of a static field?

A static field or class variable is useful in certain programming languages and code situations to assign a particular variable (representing a common characteristic) to all instances of a class, either as a fixed value, or one that could change in the future.


1 Answers

First off, we need to be sure we're on the same page as to what a tracing garbage collection algorithm does in its mark phase.

At any given moment, a tracing GC has a number of objects that are known to be alive, in the sense that they are reachable by the running program as it stands right now. The main step of mark phrase involves following the non-static fields of those objects to find more objects, and those new objects will now also be known to be alive. This step is repeated recursively until no new alive objects are found by traversing the existing live objects. All objects in memory not proved live are considered dead. (The GC then moves to the next phase, which is called the sweep phase. We don't care about that phase for this answer.)

Now this alone is not enough to execute the algorithm. In the beginning, the algorithm has no objects that it knows to be alive, so it can't start following anyone's non-static fields. We need to specify a set of objects that are considered known to be alive from the start. We choose those objects axiomatically, in the sense that they don't come from a previous step of the algorithm -- they come from outside. Specifically, they come from the semantics of the language. Those objects are called roots.

In a language like Java, there are two sets of objects that are definite GC roots. Anything that is accessible by a local variable that's still in scope is obviously reachable (within its method, which still hasn't returned), therefore it's alive, therefore it's a root. Anything that is accessible through a static field of a class is also obviously reachable (from anywhere), therefore it's alive, therefore it's a root.

But if non-static fields were considered roots as well, what would happen?

Say you instantiate an ArrayList<E>. Inside, that object has a non-static field that points to an Object[] (the backing array that represents the storage of the list). At some point, a GC cycle starts. In the mark phase, the Object[] is marked as alive because it is pointed to by the ArrayList<E> private non-static field. The ArrayList<E> is not pointed to by anything, so it fails to be considered alive. Thus, in this cycle, the ArrayList<E> is destroyed while the backing Object[] survives. Of course, at the next cycle, the Object[] also dies, because it is not reachable by any root. But why do this in two cycles? If the ArrayList<E> was dead in the first cycle and if Object[] is used only by a dead object, shouldn't the Object[] also be considered dead in the same move, to save time and space?

That's the point here. If we want to be maximally efficient (in the context of a tracing GC), we need to get rid of as many dead objects as possible in a single GC.

To do that, a non-static field should keep an object alive only if the enclosing object (the object that contains the field) has been proved to be alive. By contrast, roots are objects we call alive axiomatically (without proof) in order to kick-start the algorithm's marking phase. It is in our best interest to limit the latter category to the bare minimum that doesn't break the running program.

For example, say you have this code:

class Foo {
    Bar bar = new Bar();

    public static void main(String[] args) {
        Foo foo = new Foo();
        System.gc();
    }

    public void test() {
        Integer a = 1;
        bar.counter++; //access to the non-static field
    }
}

class Bar {
    int counter = 0;
}
  • When the garbage collection starts, we get one root that's the local variable Foo foo. That's it, that's our only root.
  • We follow the root to find the instance of Foo, which is marked as alive and then we attempt to find its non-static fields. We find one of them, the Bar bar field.
  • We follow the fields to find the instance of Bar, which is marked as alive and then we attempt to find its non-static fields. We find that it contains no more fields that are reference types, so the GC doesn't need to bother for that object anymore.
  • Since we can't get find new alive objects in this round of recursion, the mark phase can end.

Alternatively:

class Foo {
    Bar bar = new Bar();

    public static void main(String[] args) {
        Foo foo = new Foo();
        foo.test();
    }

    public void test() {
        Integer a = 1;
        bar.counter++; //access to the non-static field
        System.gc();
    }
}

class Bar {
    int counter = 0;
}
  • When the garbage collection starts, the local variable Integer a is a root and the Foo this reference (the implicit reference that all non-static methods get) is also a root. The local variable Foo foo from main is also a root because main hasn't gone out of scope yet.
  • We follow the root to find the instance of Integer and instance of Foo (we find one of these objects twice, but this doesn't matter for the algorithm), which are marked as alive and then we attempt to follow their non-static fields. Let's say the instance of Integer has no more fields to class instances. The instance of Foo gives us one Bar field.
  • We follow the field to find the instance of Bar, which is marked as alive and then we attempt to find its non-static fields. We find that it contains no more fields that are reference types, so the GC doesn't need to bother for that object anymore.
  • Since we can't get find new alive objects in this round of recursion, the mark phase can end.
like image 54
Theodoros Chatzigiannakis Avatar answered Sep 23 '22 18:09

Theodoros Chatzigiannakis