
Why does the C# compiler in some cases emit newobj/stobj rather than 'call instance .ctor' for struct initialization?

Tags: c#, .net, il

Here is a test program in C#:

using System;


struct Foo {
    int x;
    public Foo(int x) {
        this.x = x;
    }
    public override string ToString() {
        return x.ToString();
    }
}

class Program {
    static void PrintFoo(ref Foo foo) {
        Console.WriteLine(foo);
    }
    
    static void Main(string[] args) {
        Foo foo1 = new Foo(10);
        Foo foo2 = new Foo(20);
        
        Console.WriteLine(foo1);
        PrintFoo(ref foo2);
    }
}

And here is the disassembled IL of the compiled Main method:

.method private hidebysig static void Main (string[] args) cil managed {
    // Method begins at RVA 0x2078
    // Code size 42 (0x2a)
    .maxstack 2
    .entrypoint
    .locals init (
        [0] valuetype Foo foo1,
        [1] valuetype Foo foo2
    )

    IL_0000: ldloca.s foo1
    IL_0002: ldc.i4.s 10
    IL_0004: call instance void Foo::.ctor(int32)
    IL_0009: ldloca.s foo2
    IL_000b: ldc.i4.s 20
    IL_000d: newobj instance void Foo::.ctor(int32)
    IL_0012: stobj Foo
    IL_0017: ldloc.0
    IL_0018: box Foo
    IL_001d: call void [mscorlib]System.Console::WriteLine(object)
    IL_0022: ldloca.s foo2
    IL_0024: call void Program::PrintFoo(valuetype Foo&)
    IL_0029: ret
} // end of method Program::Main

I don't understand why newobj/stobj was emitted for foo2 instead of a simple call to .ctor. To make it more mysterious, the JIT compiler optimizes newobj + stobj into a single ctor call in 32-bit mode, but not in 64-bit mode.

UPDATE:

To clarify my confusion, here is what I expected.

A value-type declaration expression like

Foo foo = new Foo(10)

should be compiled via

call instance void Foo::.ctor(int32)

A value-type declaration expression like

Foo foo = default(Foo)

should be compiled via

initobj Foo

In my opinion, the temporary variable of a construction expression, or the instance produced by a default expression, should simply be the target variable itself, since this cannot lead to any dangerous behaviour:

try{
    //foo invisible here
    ...
    Foo foo = new Foo(10);
    //we never get here, if something goes wrong
}catch(...){
    //foo invisible here
}finally{
    //foo invisible here
}

An assignment expression like

foo = new Foo(10); // foo declared somewhere before

should be compiled to something like this:

.locals init (
    ...
    valuetype Foo __temp,
    ...
)

...
ldloca __temp
ldc.i4 10
call instance void Foo::.ctor(int32)
ldloc __temp
stloc foo
...

This is how I understand what the C# specification says:

7.6.10.1 Object creation expressions

...

The run-time processing of an object-creation-expression of the form new T(A), where T is class-type or a struct-type and A is an optional argument-list, consists of the following steps:

...

If T is a struct-type:

  • An instance of type T is created by allocating a temporary local variable. Since an instance constructor of a struct-type is required to definitely assign a value to each field of the instance being created, no initialization of the temporary variable is necessary.

  • The instance constructor is invoked according to the rules of function member invocation (§7.5.4). A reference to the newly allocated instance is automatically passed to the instance constructor and the instance can be accessed from within that constructor as this.

I want to emphasize "allocating a temporary local variable". In my understanding, the newobj instruction implies creating an object on the heap.

What confuses me here is that the way an object is created depends on how it is used later, since foo1 and foo2 look identical to me.

andrey.ko asked Mar 04 '13 17:03



1 Answer

First off, you should read my article on this subject. It does not address your specific scenario, but it has some good background information:

https://ericlippert.com/2010/10/11/debunking-another-myth-about-value-types/

OK, so now that you've read that you know that the C# specification states that constructing an instance of a struct has these semantics:

  • Create a temporary variable to store the struct value, initialized to the default value of the struct.
  • Pass a reference to that temporary variable as the "this" of the constructor

So when you say:

Foo foo = new Foo(123);

That is equivalent to:

Foo foo;
Foo temp = default(Foo);
Foo.ctor(ref temp, 123); // "this" is a ref to a variable in a struct.
foo = temp;
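
The pseudo-code above can be made compilable by modelling the constructor as a static method that takes "this" as an explicit ref parameter. This is only a sketch of the semantics, not what the compiler literally emits; the names Ctor and LoweringSketch are hypothetical and exist only for illustration:

```csharp
using System;

struct Foo
{
    public int X;

    // Hypothetical stand-in for the instance constructor:
    // "this" is modelled as an explicit ref parameter.
    public static void Ctor(ref Foo self, int x)
    {
        self.X = x;
    }
}

static class LoweringSketch
{
    public static int Run()
    {
        Foo foo;
        Foo temp = default(Foo);   // temporary, initialized to the default value
        Foo.Ctor(ref temp, 123);   // the "constructor" writes through the reference
        foo = temp;                // copy the finished value into the target
        return foo.X;
    }

    static void Main()
    {
        Console.WriteLine(Run());  // 123
    }
}
```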

Now, you might ask why go through all the trouble of allocating a temporary when we already have a variable foo right there that could be this:

Foo foo = default(Foo);
Foo.ctor(ref foo, 123);

That optimization is called copy elision. The C# compiler and/or the jitter are permitted to perform a copy elision when they determine using their heuristics that doing so is always invisible. There are rare circumstances in which a copy elision can cause an observable change in the program, and in those cases the optimization must not be used. For example, suppose we have a pair-of-ints struct:

Pair p = default(Pair);
try { p = new Pair(10, 20); } catch {}
Console.WriteLine(p.First);
Console.WriteLine(p.Second);

We expect that p here is either (0, 0) or (10, 20), never (10, 0) or (0, 20), even if the ctor throws halfway through. That is, either the assignment to p was of the completely constructed value, or no modification was made to p at all. The copy elision cannot be performed here; we have to make a temporary, pass the temporary to the ctor, and then copy the temporary to p.
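
A minimal runnable sketch of this scenario, assuming a hypothetical Pair whose constructor throws after it has already written First. The throwing condition, the public fields and the ThrowingCtorDemo class are illustrative assumptions, not part of the original program:

```csharp
using System;

struct Pair
{
    public int First;
    public int Second;

    public Pair(int first, int second)
    {
        First = first;                 // the first field is written...
        if (second < 0)                // ...then the constructor fails halfway
            throw new ArgumentOutOfRangeException(nameof(second));
        Second = second;
    }
}

static class ThrowingCtorDemo
{
    public static Pair Run()
    {
        Pair p = default(Pair);
        try { p = new Pair(10, -1); } catch (ArgumentOutOfRangeException) { }
        return p;
    }

    static void Main()
    {
        Pair p = Run();
        // The half-built value lived only in the temporary, so p is untouched.
        Console.WriteLine(p.First);    // 0
        Console.WriteLine(p.Second);   // 0
    }
}
```

If the constructor had been run directly on p, the first line printed would be 10 instead of 0, which is exactly the torn state the temporary is there to prevent.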

Similarly, suppose we had this insanity:

Pair p = default(Pair);
p = new Pair(10, 20, ref p);
Console.WriteLine(p.First);
Console.WriteLine(p.Second);

If the C# compiler performs the copy elision then this and ref p are both aliases to p, which is observably different than if this is an alias to a temporary! The ctor could observe that changes to this cause changes to ref p if they alias the same variable, but would not observe that if they aliased different variables.
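
The aliasing case can be sketched the same way. The three-argument constructor below is a hypothetical one, made up so that the result reveals whether this and the ref parameter refer to the same variable; AliasingDemo is likewise an illustrative name:

```csharp
using System;

struct Pair
{
    public int First;
    public int Second;

    public Pair(int first, int second, ref Pair other)
    {
        First = first;
        // If "this" aliased "other", other.First would already equal `first`
        // here and the sum below would come out as 30 rather than 20.
        Second = second + other.First;
    }
}

static class AliasingDemo
{
    public static Pair Run()
    {
        Pair p = default(Pair);
        p = new Pair(10, 20, ref p);
        return p;
    }

    static void Main()
    {
        Pair p = Run();
        Console.WriteLine(p.First);    // 10
        Console.WriteLine(p.Second);   // 20: the ctor saw the old p, not "this"
    }
}
```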

The C# compiler heuristic is deciding to do the copy elision on foo1 but not foo2 in your program. It is seeing that there is a ref foo2 in your method and deciding right there to give up. It could do a more sophisticated analysis to determine that it is not in one of these crazy aliasing situations, but it doesn't. The cheap and easy thing to do is to just skip the optimization if there is any chance, however remote, that there could be an aliasing situation that makes the elision visible. It generates the newobj code and lets the jitter decide whether it wants to make the elision.

As for the jitter: the 64 bit and 32 bit jitters have completely different optimizers. Apparently one of them is deciding that it can introduce the copy elision that the C# compiler did not, and the other one is not.

Eric Lippert answered Sep 19 '22 14:09