
Why aren't simple properties optimized to fields?

Tags: c#, .net

sealed class A
{
    public int X;
    public int Y { get; set; }
}

If I create a new instance of A it takes me about 550ms to access Y 100,000,000 times, while it takes about 250ms to access X. I'm running it as a release build and it's still much slower for the property. Why doesn't .NET optimize Y to a field?

Edit:

    A t = new A();
    t.Y = 50;
    t.X = 50;

    Int64 y = 0;

    Stopwatch sw = new Stopwatch();
    sw.Start();

    for (int i = 0; i < 100000000; i++)
        y += t.Y;

    sw.Stop();

That's the code I'm using to test; I change t.Y to t.X to test X instead. Also, I'm in release build.

Levi H asked Mar 16 '13 20:03



2 Answers

for (int i = 0; i < 100000000; i++)
    y += t.X;

This is very difficult code to profile. You can see that when looking at the generated machine code with Debug + Windows + Disassembly. The x64 code looks like this:

0000005a  xor         r11d,r11d                           ; i = 0
0000005d  mov         eax,dword ptr [rbx+0Ch]             ; read t.X
00000060  add         r11d,4                              ; i += 4
00000064  cmp         r11d,5F5E100h                       ; test i < 100000000
0000006b  jl          0000000000000060                    ; for (;;)

This is heavily optimized code; note how the += operator has completely disappeared. That happened because of a mistake in your benchmark: you are not using the computed value of y at all. The jitter knows this, so it simply removed the pointless addition. The increment by 4 needs an explanation as well: it is a side effect of a loop unrolling optimization. You'll see it used later.

So you must change your benchmark to make it realistic. Add this line at the end:

sw.Stop();
Console.WriteLine("{0} msec, {1}", sw.ElapsedMilliseconds, y);

This forces the value of y to be computed. The machine code now looks completely different:

0000005d  xor         ebp,ebp                             ; y = 0
0000005f  mov         eax,dword ptr [rbx+0Ch]          
00000062  movsxd      rdx,eax                             ; rdx = t.X
00000065  nop         word ptr [rax+rax+00000000h]        ; align branch target
00000070  lea         rax,[rdx+rbp]                       ; y += t.X
00000074  lea         rcx,[rax+rdx]                       ; y += t.X
00000078  lea         rax,[rcx+rdx]                       ; y += t.X
0000007c  lea         rbp,[rax+rdx]                       ; y += t.X
00000080  add         r11d,4                              ; i += 4
00000084  cmp         r11d,5F5E100h                       ; test i < 100000000
0000008b  jl          0000000000000070                    ; for (;;)

Still very optimized code. The weirdo NOP instruction ensures that the jump at address 008b is efficient: jumping to an address that's aligned to 16 helps the instruction decoder unit in the processor. The LEA instruction is a classic trick to let the address generation unit perform an addition, leaving the main ALUs free to do other work at the same time. There is no other work to do here, but there could be if the loop body were more involved. And the loop was unrolled 4 times to avoid branch instructions.

Anyhoo, now you are actually measuring real code, instead of removed code. Result on my machine, repeating the test 10 times (important!):

y += t.X: 125 msec
y += t.Y: 125 msec

Exactly the same amount of time. Of course, it should be that way. You don't pay for a property.

The jitter does an excellent job of generating quality machine code. If you get a strange result then always check your test code first. It is the code most likely to have a mistake. Not the jitter, it has been thoroughly tested.
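As a rough sketch of the corrected benchmark described above, the version below uses the summed value and repeats each measurement 10 times. The class name PropertyBenchmark and the helper methods SumField and SumProperty are made up for this illustration. Moving the loops into separate methods is also an assumption of the sketch and may change the generated machine code compared to the listings above, so treat the numbers as indicative only.

```csharp
using System;
using System.Diagnostics;

sealed class A
{
    public int X;
    public int Y { get; set; }
}

// Hypothetical harness, not the original poster's code.
static class PropertyBenchmark
{
    const int Iterations = 100000000;

    // Sums the field. Returning the total keeps the additions observable,
    // so the JIT cannot discard them as dead code.
    public static long SumField(A t, int iterations)
    {
        long y = 0;
        for (int i = 0; i < iterations; i++)
            y += t.X;
        return y;
    }

    // Same loop, reading the auto-property instead of the field.
    public static long SumProperty(A t, int iterations)
    {
        long y = 0;
        for (int i = 0; i < iterations; i++)
            y += t.Y;
        return y;
    }

    static void Main()
    {
        A t = new A();
        t.X = 50;
        t.Y = 50;

        // Repeat the measurement; the first run includes JIT compilation time.
        for (int run = 0; run < 10; run++)
        {
            Stopwatch sw = Stopwatch.StartNew();
            long viaField = SumField(t, Iterations);
            sw.Stop();
            long fieldMs = sw.ElapsedMilliseconds;

            sw = Stopwatch.StartNew();
            long viaProperty = SumProperty(t, Iterations);
            sw.Stop();

            // Printing both totals forces them to be computed.
            Console.WriteLine("X: {0} msec ({1}), Y: {2} msec ({3})",
                fieldMs, viaField, sw.ElapsedMilliseconds, viaProperty);
        }
    }
}
```

Build it in Release mode and run it without a debugger attached; otherwise the JIT optimizations discussed here may be disabled.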

Hans Passant answered Sep 27 '22 19:09



X is just a simple field. However, Y is a property with get and set accessors, named int get_Y() and void set_Y(int) internally. There's also a private backing field for Y with a special compiler-generated name, and the accessors access the backing field. The following image shows this in practice:

(image: uplyZ.jpg)

That's how the compiler should do it, as per the C# Language Specification. If the C# compiler emitted a field instead, it would violate the spec.
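One way to see the accessors and the backing field for yourself is reflection. The following is a small sketch; the class AutoPropertyShape and its helper methods are invented for the illustration. The accessor names get_Y and set_Y are fixed by the language, while the backing field's name is a compiler implementation detail (Microsoft's compilers typically emit something like `<Y>k__BackingField`), so the sketch just prints whatever it finds.

```csharp
using System;
using System.Reflection;

sealed class A
{
    public int X;
    public int Y { get; set; }
}

// Hypothetical helper for inspecting what the compiler generated for A.
static class AutoPropertyShape
{
    // Name of the compiler-generated get accessor for A.Y.
    public static string GetterName()
    {
        return typeof(A).GetProperty("Y").GetGetMethod().Name;
    }

    // Name of the compiler-generated set accessor for A.Y.
    public static string SetterName()
    {
        return typeof(A).GetProperty("Y").GetSetMethod().Name;
    }

    // Private instance fields of A; the auto-property's backing field is among them.
    public static FieldInfo[] PrivateFields()
    {
        return typeof(A).GetFields(BindingFlags.NonPublic | BindingFlags.Instance);
    }

    static void Main()
    {
        Console.WriteLine(GetterName());
        Console.WriteLine(SetterName());

        // The exact field name depends on the compiler that built the assembly.
        foreach (FieldInfo field in PrivateFields())
            Console.WriteLine("{0} {1}", field.FieldType, field.Name);
    }
}
```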

The runtime has to use the accessors generated by the compiler, of course. But the runtime may do tricks like inlining to avoid the extra call to the accessor. That's an optimization that might make property access just as fast as field access.
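To get a feel for what inlining buys, you can compare a trivial accessor with an identical one that is explicitly excluded from inlining. This is only a sketch: the types B and InliningDemo are hypothetical, and whether the first accessor actually gets inlined is up to the JIT, so no timings are claimed here.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

// Hypothetical type for illustration; not part of the question's code.
sealed class B
{
    private int _value = 50;

    // Trivial accessor: the JIT is allowed (but not required) to inline it.
    public int Inlinable
    {
        get { return _value; }
    }

    // Identical body, but the attribute forbids inlining,
    // so every read goes through a real call to the accessor.
    public int NotInlined
    {
        [MethodImpl(MethodImplOptions.NoInlining)]
        get { return _value; }
    }
}

static class InliningDemo
{
    public static long SumInlinable(B b, int iterations)
    {
        long sum = 0;
        for (int i = 0; i < iterations; i++)
            sum += b.Inlinable;
        return sum;
    }

    public static long SumNotInlined(B b, int iterations)
    {
        long sum = 0;
        for (int i = 0; i < iterations; i++)
            sum += b.NotInlined;
        return sum;
    }

    static void Main()
    {
        B b = new B();
        const int iterations = 100000000;

        Stopwatch sw = Stopwatch.StartNew();
        long first = SumInlinable(b, iterations);
        sw.Stop();
        // The sums are printed so the loops cannot be optimized away.
        Console.WriteLine("inlinable:   {0} msec, {1}", sw.ElapsedMilliseconds, first);

        sw = Stopwatch.StartNew();
        long second = SumNotInlined(b, iterations);
        sw.Stop();
        Console.WriteLine("not inlined: {0} msec, {1}", sw.ElapsedMilliseconds, second);
    }
}
```

If the two timings come out noticeably different in a Release build, that gap is roughly the cost of the call that inlining removes for ordinary auto-properties.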

Hans Passant has emphasized that the runtime will in fact make the property access just as fast. Your original test code was flawed: the runtime could remove the read because the local variable it was added to was never used. See Passant's answer for details.

Still, if you want a plain field, write one, and don't make an auto-property.

Jeppe Stig Nielsen answered Sep 27 '22 19:09
