
C# BeforeFieldInit Jon Skeet explanation confusion

I read through Jon Skeet's article about beforefieldinit and have a question. He mentions that the type initializer can be invoked at any time before the first reference to a static field.

This is my test code:

using System;

class Test1
{
    public static string x1 = EchoAndReturn1("Init x1");

    public static string EchoAndReturn1(string s)
    {
        Console.WriteLine(s);
        return s;
    }
}

class Programm
{
    public static void Main()
    {
        Console.WriteLine("Starting Main");

        Test1.EchoAndReturn1("Echo 1");

        Console.WriteLine("After echo");

        string y = Test1.x1;  //marked line
    }
}

The output is:

Init x1
Starting Main
Echo 1
After echo

But without the marked line, that is, without the access to the static field x1, the output is:

Starting Main
Init x1
Echo 1
After echo

So does accessing a static field of a type flagged with beforefieldinit affect when its type initializer is invoked? Or is this part of the strange effect of beforefieldinit that he mentioned?

So, beforefieldinit can make the invocation of the type initializer even lazier or more eager.

L. Guthardt asked Nov 21 '17 14:11


1 Answer

I'm not sure what question is being asked here; perhaps if I explain what is going on in the mind of the jitter, that will answer the question.

Types with an explicit static constructor have "strict" semantics about when the cctor (the static constructor, also called the type initializer) runs: it runs immediately before the first use of a member of the type. So if you have

if (whatever) x = Foo.Bar;

then the cctor for Foo is NOT run if whatever is false, because we have not yet encountered an actual use of a member.
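The strict rule can be observed with a small program. This is a minimal sketch: the body of Foo, the StrictDemo class, and the use of command-line arguments to drive whatever are assumptions made for illustration, not code from the question.

```csharp
using System;

class Foo
{
    public static string Bar = "bar";

    // The explicit static constructor gives Foo the strict semantics:
    // the type initializer runs only at the first actual member access.
    static Foo()
    {
        Console.WriteLine("Foo cctor");
    }
}

class StrictDemo
{
    public static void Main(string[] args)
    {
        bool whatever = args.Length > 0;  // false when run without arguments
        string x = "unset";

        Console.WriteLine("Before if");
        if (whatever) x = Foo.Bar;        // the only use of a member of Foo
        Console.WriteLine("After if: " + x);
    }
}
```

Run without arguments, this never prints "Foo cctor", because the field access is never reached. Run with any argument, "Foo cctor" appears between "Before if" and "After if: bar".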

Think about what this must mean for the jitted code. How would you write a jitter for a language that has this requirement?

For static method calls you could put a little prequel at every call site that checks if the cctor has been run. But that makes every call site bigger and slower.

You could put the prequel into the static method itself. That would keep the call sites small, but every call would still get slightly slower.

Or, you could be clever and have the jitter perform the check the first time the static method is jitted. That way you pay the cost of the check only once, and call sites stay small. The jit cost gets larger, but only by a tiny fraction; jitting is already expensive.

Notice however that doing so precludes any optimization that causes a method to be jitted before its first call, because such an optimization now introduces a correctness problem. Optimization almost always involves trade-offs!

But for field accesses, there's no method to jit. The jitter would have to put a little prequel in front of every access to the field that could possibly be the first. So accessing a field not only gets slow, but the code also gets big.
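To make the cost concrete, here is a hand-written model of what such a prequel amounts to. It is only a sketch: FooModel, EnsureInitialized, and the initialized flag are hypothetical names, the real check is emitted by the jitter rather than written in C#, and this model ignores thread safety.

```csharp
using System;

// Hypothetical model, not real runtime code: Foo's static state with the
// initialization check written out by hand.
static class FooModel
{
    static bool initialized;   // stands in for the runtime's "cctor has run" flag
    public static string Bar;  // raw field; every reader must run the check first

    // The "prequel": cheap once initialized, but paid at every guarded site.
    public static void EnsureInitialized()
    {
        if (initialized) return;
        initialized = true;
        Console.WriteLine("FooModel cctor");
        Bar = "bar";
    }
}

class PrequelDemo
{
    public static void Main(string[] args)
    {
        bool whatever = args.Length > 0;

        if (whatever)
        {
            FooModel.EnsureInitialized();   // this access could be the first
            Console.WriteLine(FooModel.Bar);
        }

        FooModel.EnsureInitialized();       // so could this one
        Console.WriteLine(FooModel.Bar);
    }
}
```

Both accesses need the guard because either one could be the first to execute, which is exactly why the code gets bigger as well as slower.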

You might think, "why not make the field into a property and put the prequel on the jitting of the getter and setter?" But that doesn't work, because fields are variables and properties are not. We need to be able to pass static fields via ref and out, for instance, but you can't do that with a property. The field might be volatile, and a property cannot be. And so on.
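The difference between a field and a property is easy to see in code. In this sketch the Counters and RefDemo classes are made up for illustration:

```csharp
using System;

class Counters
{
    public static int Field;                   // a variable: it has a storage location
    public static int Property { get; set; }   // a pair of methods, not a variable
    public static volatile int VolatileField;  // 'volatile' is only valid on fields
}

class RefDemo
{
    static void Increment(ref int value)
    {
        value++;
    }

    public static void Main()
    {
        Increment(ref Counters.Field);         // fine: a field can be passed by ref
        // Increment(ref Counters.Property);   // does not compile: a property is not a variable
        Console.WriteLine(Counters.Field);     // prints 1
    }
}
```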

It would be nice to be able to avoid these costs on field accesses.

Types without an explicit cctor but with a compiler-generated implicit cctor to initialize the static fields get "relaxed" semantics, where the jitter merely guarantees that the cctor is called at some point before a static field is first accessed. Your program uses these relaxed semantics.
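For comparison, here is a variant of the question's Test1 that opts back into the strict semantics. It is a sketch: the names Test2, x2, EchoAndReturn2, and Program2 are invented so that it does not clash with the original program, and the only substantive change is the empty static constructor.

```csharp
using System;

class Test2
{
    public static string x2 = EchoAndReturn2("Init x2");

    // An empty explicit static constructor is enough to make the C# compiler
    // omit beforefieldinit, so the strict semantics apply to Test2.
    static Test2() { }

    public static string EchoAndReturn2(string s)
    {
        Console.WriteLine(s);
        return s;
    }
}

class Program2
{
    public static void Main()
    {
        Console.WriteLine("Starting Main");

        Test2.EchoAndReturn2("Echo 2");   // first use of a member: cctor runs here

        Console.WriteLine("After echo");

        string y = Test2.x2;              // removing this line no longer changes the order
    }
}
```

With the explicit static constructor, the order is "Starting Main", "Init x2", "Echo 2", "After echo" whether or not the final field access is present, because the initializer is tied to the first member use rather than left to the jitter's discretion.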

In the first version, with a field access, the jitter knows from its analysis of the method that a static field might be accessed. (Why "might"? As before, the access could be under an if.) The jitter is allowed to run the cctor at any time before the first access, so it makes a note that says "when Main is jitted, check to see if the Test1 cctor has been run, and if not, run it."

If Main is called a second time, hey, it's only jitted once. So again, the cost of the check is only borne on the first call. (Of course Main is only ever called once in most programs, but you can write a recursive Main if you're into that sort of thing.)

In your second program there is no field access. The jitter could also reason that a static method is called, and that the cctor could be run at jit time for Main. It does not. Why not? I don't know; you'd have to ask the jitter team about that. But the point is that the jitter is entirely within its rights to use a heuristic to decide whether or not to run the cctor at jit time, and it does so.

The jitter is also within its rights to use a heuristic to decide whether or not a call to a static method that touches no field triggers the cctor; in this case apparently it does so, unnecessarily.

Your question seems to be "what are these heuristics?" and the answer is... well, I don't definitively know what the answer is, and it is an implementation detail of the runtime subject to change at its whim. You've seen in this answer what some good guesses are about the nature of those heuristics:

  • Check to see if the cctor of T needs running when any static method of T is jitted
  • Check to see if the cctor of T needs running when any method that accesses a static field of T is jitted

Those heuristics would fulfill the requirements of the relaxed semantics, and would avoid emitting all checks at call sites, and would still ensure reasonable behaviour.

But you can't rely on those guesses. All you can rely on is that the cctor will get run at some point before the first field access, and that's what you're getting. Whether or not there is a field access in a particular method is plainly a part of that heuristic, but those heuristics could change.

Eric Lippert answered Nov 18 '22 03:11