
Object lifespan in Java vs .NET

I was reading "CLR via C#" and it seems that in this example, the object that was initially assigned to 'obj' will be eligible for Garbage Collection after line 1 is executed, not after line 2.

void Foo()
{
    Object obj = new Object();
    obj = null;
}

That's because a local variable's lifespan is defined not by the scope in which it was declared, but by the last time it is read.

So my question is: what about Java? I wrote the program below to check this behavior, and it looks like the object stays alive. I don't think the JVM can limit a variable's lifetime while interpreting bytecode, so I tried running the program with 'java -Xcomp' to force method compilation, but 'finalize' is still not called. It looks like this doesn't hold for Java, but I hope I can get a more accurate answer here. Also, what about Android's Dalvik VM?

class TestProgram {

    public static void main(String[] args) {
        TestProgram ref = new TestProgram();
        System.gc();
    }

    @Override
    protected void finalize() {
        System.out.println("finalized");
    }
}

Added: Jeffrey Richter gives a code example in "CLR via C#", something like this:

public static void Main (string[] args)
{
    var timer = new Timer(TimerCallback, null, 0, 1000); // call every second
    Console.ReadLine();
}

public static void TimerCallback(Object o)
{
    Console.WriteLine("Callback!");
    GC.Collect();
}

TimerCallback is called only once on MS .NET if the project's target is 'Release' (the timer is destroyed after the GC.Collect() call), and called every second if the target is 'Debug' (variable lifespans are extended because the programmer may try to access the object with a debugger). But on Mono the callback is called every second no matter how you compile it. It looks like Mono's 'Timer' implementation stores a reference to the instance somewhere in the thread pool. The MS implementation doesn't do this.

StaceyGirl asked Jan 11 '12 17:01


1 Answer

Note that just because an object can be collected doesn't mean that it will actually be collected at any given point, so your method can give false negatives. If an object's finalize method is called you can definitely say that it was unreachable, but if the method is not called you can't logically infer anything. As with most GC-related questions, the non-determinism of the garbage collector makes it hard to come up with tests/guarantees about exactly what it will do.
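To make the false-negative point concrete, here is a rough sketch of a probe that uses a WeakReference instead of finalize. The class name CollectionProbe, the method observedCollected and the field pinned are made up for illustration, and System.gc() is only a hint to the VM. The one outcome you can rely on is the pinned case: an object held by a static field is never reported as collected. For the unpinned object either answer is legitimate.

```java
import java.lang.ref.WeakReference;

// Hypothetical helper, not part of any library.
final class CollectionProbe {

    // A strong reference held in a static field keeps its referent reachable.
    static Object pinned;

    /**
     * Requests a GC up to maxAttempts times and reports whether the weak
     * reference was observed to be cleared. A result of false proves nothing:
     * the object may be unreachable and simply not collected yet.
     */
    static boolean observedCollected(WeakReference<?> ref, int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            if (ref.get() == null) {
                return true;
            }
            System.gc(); // a request only; the VM is free to ignore it
        }
        return ref.get() == null;
    }

    public static void main(String[] args) {
        pinned = new Object();
        WeakReference<Object> pinnedRef = new WeakReference<>(pinned);
        // Strongly reachable through the static field, so this prints false.
        System.out.println("pinned collected: " + observedCollected(pinnedRef, 5));

        WeakReference<Object> looseRef = new WeakReference<>(new Object());
        // No strong reference remains; true and false are both valid outcomes.
        System.out.println("loose collected: " + observedCollected(looseRef, 5));
    }
}
```

So a "true" from the probe tells you something, while a "false" only tells you that the collector hasn't got round to it (or that the object is still reachable); you can't distinguish the two from the outside.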

On the topic of reachability/collectability, the JLS says (12.6.1):

A reachable object is any object that can be accessed in any potential continuing computation from any live thread. Optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable. For example, a compiler or code generator may choose to set a variable or parameter that will no longer be used to null to cause the storage for such an object to be potentially reclaimable sooner.

Which is more or less exactly what you'd expect - I think the above paragraph is isomorphic with "an object is unreachable iff you definitely won't use it any more".

Going back to your original situation, can you think of any practical difference between the object being deemed unreachable after line 1 as opposed to line 2? My initial reaction is that there is none, and if you somehow managed to find such a situation it would likely be a mark of bad/twisted code causing the VM to struggle rather than an inherent weakness in the language.

Though I'm open to counter-arguments.


Edit: Thanks for the interesting example.

I agree with your assessment and see where you're going, though the issue is probably more that debug mode is subtly changing the semantics of your code.

In the code as written, you assign the Timer to a local variable which is not subsequently read within its scope. Even the most trivial escape analysis can reveal that the timer variable isn't used anywhere else in the main method, and so can be elided. Hence I think your first line can be considered exactly equivalent to just invoking the constructor directly:

public static void Main (string[] args)
{
    new Timer(TimerCallback, null, 0, 1000); // call every second
    ...

In this latter case it's clear that the newly created Timer object isn't reachable immediately after construction (assuming that it doesn't do anything sneaky like add itself to static fields etc. in its constructor), and so it would be collected as soon as the GC got round to it.

Now in the debug case things are subtly different, for the reason you've mentioned which is that the developer may wish to inspect the state of the local variables later on in the method. Therefore the compiler (and the JIT compiler) can't optimise these away; it's as if there is an access of the variable at the end of the method, preventing collection until that point.
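On the Java side you can write that "access at the end of the method" explicitly. The sketch below assumes Java 9 or later, where java.lang.ref.Reference.reachabilityFence exists; the class and method names (KeepAliveSketch, withoutFence, withFence) are made up for illustration. Without the fence the VM is allowed, but not obliged, to treat the object as unreachable once the local is no longer read. With the fence the object must stay strongly reachable up to that call.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;

// Hypothetical demo class, assuming Java 9+ for Reference.reachabilityFence.
final class KeepAliveSketch {

    /** The local is never read after the WeakReference is built. */
    static boolean withoutFence() {
        Object obj = new Object();
        WeakReference<Object> weak = new WeakReference<>(obj);
        System.gc(); // a hint only
        // Unspecified: an optimising compiler may already treat obj as dead,
        // while an interpreter will typically still hold it in the local slot.
        return weak.get() != null;
    }

    /** Same code, but with an explicit use of the local at the end. */
    static boolean withFence() {
        Object obj = new Object();
        WeakReference<Object> weak = new WeakReference<>(obj);
        try {
            System.gc();
            // obj is strongly reachable until the fence below, so the weak
            // reference cannot have been cleared at this point.
            return weak.get() != null;
        } finally {
            Reference.reachabilityFence(obj);
        }
    }

    public static void main(String[] args) {
        System.out.println("without fence, still alive: " + withoutFence());
        System.out.println("with fence, still alive: " + withFence());
    }
}
```

The try/finally shape follows the pattern shown in the reachabilityFence Javadoc. As far as I know the .NET counterpart is GC.KeepAlive, which would be the way to pin the Timer in your example if you really wanted it to keep ticking in a Release build.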

Even so, I don't think this actually changes the semantics. The nature of GC is that collection is seldom guaranteed (in Java at least, the only guarantee you get is that if an OutOfMemoryError was thrown then everything deemed unreachable was GCed immediately beforehand). In fact, assuming you had enough heap space to hold every object created during the lifetime of the runtime, a no-op GC implementation is perfectly valid. So while you might observe behavioural changes in how many times the Timer ticks, this is fine as there are no guarantees about what you'll see based on the way you're invoking it. (This is conceptually similar to how a timer that runs throughout a CPU-intensive task would tick more times when the system was under load - neither outcome is wrong because the interface doesn't provide that sort of guarantee.)

At this point I refer you back to the first sentence in this answer. :)

Andrzej Doyle answered Sep 27 '22 18:09