Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is it really a bug in JIT optimization or am I missing something?

Tags:

c#

jit

[TestFixture]
public class Tests
{
    private class Relay
    {
        public Action Do { get; set; }
    }

    [Test]
    public void OptimizerStrangeness()
    {
        var relay = new Relay();
        var indicator = 0;
        relay.Do = () => indicator++;
        var weak = new WeakReference(relay);

        GC.Collect();

        var relayNew = weak.Target as Relay;
        if (relayNew == null) Assert.Fail();
        relayNew.Do();
        Assert.AreEqual(1, indicator);
    }
}

This code fails only in Release mode at Assert.Fail() line despite the fact relay variable is still in scope and thus we still have strong reference to the instance, so WeakReference must not be dead yet.

UPD: To clarify a bit: I realize that it can be 'optimized away'. But depending on this optimization indicator variable would have 0 or 1 value, i.e. we have actual visible change of behavior.

UPD2: From C# language specification, section 3.9

If the object, or any part of it, cannot be accessed by any possible continuation of execution, other than the running of destructors, the object is considered no longer in use, and it becomes eligible for destruction. The C# compiler and the garbage collector may choose to analyze code to determine which references to an object may be used in the future. For instance, if a local variable that is in scope is the only existing reference to an object, but that local variable is never referred to in any possible continuation of execution from the current execution point in the procedure, the garbage collector may (but is not required to) treat the object as no longer in use.

Technically speaking, this object can and will be accessed by continuation of execution and thus can't be treated as 'no longer in use' (actually C# spec says nothing about weak references because it is aspect of CLR and not the compiler - compiler output is fine). Will try to search for memory management info about CLR/JIT.

UPD3: Here is some info on CLR's memory management - section 'Releasing memory':

...Every application has a set of roots. Each root either refers to an object on the managed heap or is set to null. An application's roots include global and static object pointers, local variables and reference object parameters on a thread's stack, and CPU registers. The garbage collector has access to the list of active roots that the just-in-time (JIT) compiler and the runtime maintain. Using this list, it examines an application's roots, and in the process creates a graph that contains all the objects that are reachable from the roots.

Variable in question is definitely local variable, hence it is reachable. Thus said, this mention is very quick/vague, so I would be really glad to see more concrete info.

UPD4: From sources of .NET Framework:

    // This method DOES NOT DO ANYTHING in and of itself.  It's used to
    // prevent a finalizable object from losing any outstanding references 
    // a touch too early.  The JIT is very aggressive about keeping an 
    // object's lifetime to as small a window as possible, to the point
    // where a 'this' pointer isn't considered live in an instance method 
    // unless you read a value from the instance.  So for finalizable
    // objects that store a handle or pointer and provide a finalizer that
    // cleans them up, this can cause subtle ----s with the finalizer
    // thread.  This isn't just about handles - it can happen with just 
    // about any finalizable resource.
    // 
    // Users should insert a call to this method near the end of a 
    // method where they must keep an object alive for the duration of that
    // method, up until this method is called.  Here is an example: 
    //
    // "...all you really need is one object with a Finalize method, and a
    // second object with a Close/Dispose/Done method.  Such as the following
    // contrived example: 
    //
    // class Foo { 
    //    Stream stream = ...; 
    //    protected void Finalize() { stream.Close(); }
    //    void Problem() { stream.MethodThatSpansGCs(); } 
    //    static void Main() { new Foo().Problem(); }
    // }
    //
    // 
    // In this code, Foo will be finalized in the middle of
    // stream.MethodThatSpansGCs, thus closing a stream still in use." 
    // 
    // If we insert a call to GC.KeepAlive(this) at the end of Problem(), then
    // Foo doesn't get finalized and the stream says open. 
    [System.Security.SecuritySafeCritical]  // auto-generated
    [ResourceExposure(ResourceScope.None)]
    [MethodImplAttribute(MethodImplOptions.InternalCall)]
    [ReliabilityContract(Consistency.WillNotCorruptState, Cer.Success)] 
    public static extern void KeepAlive(Object obj);

See here for my investigation in more details, if you're interested.

like image 904
Ivan Danilov Avatar asked Aug 29 '12 15:08

Ivan Danilov


1 Answers

Even if a variable is in scope, the runtime is free to collect it if it's no longer accessed in any future code path. This is why when debugging an assembly with JIT optimizations in place can give you a message about a variable's value being optimized away even though it's currently in scope.

See item 2 on 3.9 Automatic Memory Management.

To be specific,

For instance, if a local variable that is in scope is the only existing reference to an object, but that local variable is never referred to in any possible continuation of execution from the current execution point in the procedure, the garbage collector may (but is not required to) treat the object as no longer in use.

This is the crucial point; the object is considered to be inaccessible because all strong references to the object (there's only one) are unreachable. Remember that the C# spec will contain information about the language as well as information about how the compiled code should execute. Also remember that reachability is not defined by scope. As the spec states, if the compiler and runtime can determine that a variable is not present in any future code path (meaning they are never referenced at all or only referenced in paths that are determined to be unreachable, like if(false)), then the variable is considered unreachable and does not count as a strong reference.

While that particular portion of the specification does not state WeakReference explicitly, it doesn't need to. You have only one local variable that ever points to that value as far as the compiler is concerned.

WeakReference is just another class that takes an object as an argument; from the perspective of the compiler; it has no reason to believe (or make an assumption one way or the other) about whether or not that class holds on to the reference that it's passed. Consider if I had a class like this that was used instead:

public class MyClass
{
    public MyClass(object foo)
    {
        Console.WriteLine(foo);
    }
}

And in my code I did this:

var relay = new Relay();
...
var myClass = new MyClass(relay);

I haven't introduced any new strong references to the value I assigned to relay, as MyClass doesn't hold on to that reference. The fact that WeakReference is a "special" class that is designed to give you a reference to an object that doesn't count as a strong reference is irrelevant as far as the compiler is concerned.

Reachability is not defined by scope; it's defined by whether or not the variable (not value) in question lies in any possible future code path. As relay is not present in any form later in the function, the variable (and thus its reference to the object) is considered unreachable and eligible for collection. This is the reason the DisableOptimizations flag exists at the assembly level so that the runtime knows (among other things) to wait until the variable falls out of scope before it's eligible for collection so that it is accessible to the debugger.

like image 61
Adam Robinson Avatar answered Oct 02 '22 06:10

Adam Robinson