
Why does the compiler optimize ldc.i8 and not ldc.r8?

I'm wondering why this C# code

long b = 20;

is compiled to

ldc.i4.s 0x14
conv.i8

(Because it takes 3 bytes instead of the 9 required by ldc.i8 20. See this for more information.)

while this code

double a = 20;

is compiled to the 9-byte instruction

ldc.r8 20

instead of this 3-byte sequence

ldc.i4.s 0x14
conv.r8

(Using mono 4.8.)
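The byte counts quoted above can be checked with a small sketch. It assumes the opcode metadata in System.Reflection.Emit is available; the class name IlSize and its helpers are made up for illustration, and the operand widths are filled in by hand only for the operand kinds that appear in this question.

```csharp
using System;
using System.Linq;
using System.Reflection.Emit;

public static class IlSize
{
    // Operand width in bytes for the few operand kinds used in this question.
    // Anything else is rejected rather than guessed.
    static int OperandBytes(OperandType type)
    {
        switch (type)
        {
            case OperandType.InlineNone:   return 0; // conv.i8, conv.r8
            case OperandType.ShortInlineI: return 1; // ldc.i4.s
            case OperandType.InlineI:      return 4; // ldc.i4
            case OperandType.InlineI8:     return 8; // ldc.i8
            case OperandType.InlineR:      return 8; // ldc.r8
            default: throw new NotSupportedException(type.ToString());
        }
    }

    // Encoded size of an instruction sequence: opcode bytes plus operand bytes.
    public static int Bytes(params OpCode[] sequence) =>
        sequence.Sum(op => op.Size + OperandBytes(op.OperandType));
}
```

For example, IlSize.Bytes(OpCodes.Ldc_I4_S, OpCodes.Conv_R8) gives the size of the proposed short sequence, and IlSize.Bytes(OpCodes.Ldc_R8) the size of what the compiler emits today.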

Is this a missed opportunity, or does the cost of the conv.r8 outweigh the gain in code size?

Stephane Delcroix asked Dec 09 '16 10:12


2 Answers

Because a float is not a smaller double, and an integer is not a float (or vice versa).

Every int value maps 1:1 onto a long value. The same simply isn't true for float and double: floating-point operations are tricky that way. Not to mention that int-to-float conversions aren't free, unlike pushing a 1-byte value onto the stack or into a register; look at the x86-64 code produced by both approaches, not just the IL. The size of the IL is not the only factor to consider in optimisation.
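A quick way to see the asymmetry, as a rough sketch (the class and method names are invented for this example): an int always survives a round trip through long, but it does not always survive a round trip through float, and a float literal is generally not the same number as the double literal spelled with the same digits.

```csharp
using System;

public static class WideningDemo
{
    // int -> long -> int always gives back the original value.
    public static bool IntSurvivesLong(int value) => (int)(long)value == value;

    // int -> float -> long loses information once the value needs more
    // than 24 significant bits (the comparison is done in long).
    public static bool IntSurvivesFloat(int value) => (long)(float)value == value;

    // 0.1f widened to double is not the double closest to 0.1.
    public static bool FloatTenthEqualsDoubleTenth() => (double)0.1f == 0.1;
}
```

Small values such as 20 survive both round trips; 16777217 (2^24 + 1) is the first positive int that does not survive the trip through float.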

This is in contrast to decimal, which is actually a base-10 number rather than a base-2 floating-point number. There, 20M maps perfectly to 20 and vice versa, so the compiler is free to emit this:

IL_0000:  ldc.i4.s    14 
IL_0002:  newobj      System.Decimal..ctor

The same approach simply isn't safe (or cheap!) for binary floating point numbers.
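To illustrate why the decimal case is safe, here is a minimal sketch (DecimalDemo is a made-up name) that builds the value the way the constructor call does and inspects the representation of the literal with decimal.GetBits:

```csharp
using System;

public static class DecimalDemo
{
    // The Decimal(int) constructor applied to 20 yields exactly the literal 20M.
    public static bool CtorMatchesLiteral() => new decimal(20) == 20M;

    // decimal.GetBits returns { lo, mid, hi, flags }: a 96-bit integer plus
    // a word holding the sign and the power-of-ten scale.
    public static int[] Bits() => decimal.GetBits(20M);
}
```

For 20M the integer part is simply 20 and the scale is zero, so no rounding is involved anywhere.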

You might think that the two approaches are necessarily equivalent, because it shouldn't matter whether the conversion from an integer literal ("a string") to a double value happens at compile time or in IL. But this simply isn't the case, as a bit of specification diving reveals:

ECMA CLR spec, III.1.1.1:

Storage locations for floating-point numbers (statics, array elements, and fields of classes) are of fixed size. The supported storage sizes are float32 and float64. Everywhere else (on the evaluation stack, as arguments, as return types, and as local variables) floating-point numbers are represented using an internal floating-point type. In each such instance, the nominal type of the variable or expression is either float32 or float64, but its value might be represented internally with additional range and/or precision.

To keep things short, let's pretend float64 actually uses 4 binary digits, while the implementation-defined floating-point type (F) uses 5 binary digits. We want to convert an integer literal whose binary representation has more than four digits. Now compare how the two sequences behave:

ldc.r8 0.1011E2 ; expanded to 0.10110E2
ldc.r8 0.1E2
mul             ; 0.10110E2 * 0.10000E2 == 0.10110E3

conv.r8 converts to F, not float64. So we actually get:

ldc.i4.s theSameLiteral
conv.r8 ; converted to 0.10111E2
mul     ; 0.10111E2 * 0.10000E2 == 0.10111E3

Oops :)
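The toy model can be played with in code. This is only a sketch of the thought experiment, not of any real runtime: ToyPrecision and its methods are invented names, precision is modelled by keeping the top N binary digits of an integer, and it truncates rather than rounds so that the figures line up with the ones above.

```csharp
using System;

public static class ToyPrecision
{
    // Keeps only the top `bits` significant binary digits of a positive
    // integer and zeroes the rest (truncation; real hardware would round).
    public static long KeepTopBits(long value, int bits)
    {
        if (value <= 0 || bits <= 0) throw new ArgumentOutOfRangeException();
        int width = 0;
        for (long v = value; v != 0; v >>= 1) width++;   // count binary digits
        if (width <= bits) return value;                  // already fits
        int drop = width - bits;
        return (value >> drop) << drop;
    }

    // "ldc.r8": the literal is first stored as a 4-digit float64, then multiplied.
    public static long ViaLdcR8(long literal, long factor) =>
        KeepTopBits(literal, 4) * factor;

    // "ldc.i4.s + conv.r8": the literal goes straight into the 5-digit F.
    public static long ViaConvR8(long literal, long factor) =>
        KeepTopBits(literal, 5) * factor;
}
```

With the literal 23 (binary 10111) and a factor of 2, the first path computes with 10110 and the second with 10111, so the two products differ even though the source code was the same.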

Now, I'm pretty sure this isn't going to happen with an integer in the range of 0-255 on any reasonable platform. But since we're coding against the CLR specification, we can't make that assumption. The JIT compiler can, but that's too late. The language compiler may define the two to be equivalent, but the C# specification doesn't - a double local is considered a float64, not F. You can make your own language, if you so desire.

In any case, IL generators don't really optimise much. That's left to JIT compilation for the most part. If you want an optimised C#-IL compiler, write one - I doubt there's enough benefit to warrant the effort, especially if your only goal is to make the IL code smaller. Most IL binaries are already quite a bit smaller than the equivalent native code.

As for the actual code that runs, on my machine, both approaches result in exactly the same x86-64 assembly - load a double precision value from the data segment. The JIT can easily make this optimisation, since it knows what architecture the code is actually running on.

Luaan answered Nov 01 '22 06:11


I doubt you will get a more satisfactory answer than "because no one thought it necessary to implement it."

The fact is, they could have done it this way, but as Eric Lippert has stated many times, features are chosen to be implemented rather than chosen not to be implemented. In this particular case, the gain didn't outweigh the costs, e.g. additional testing and a non-trivial conversion between int and float, whereas in the case of ldc.i4.s it's not much trouble. It's also better not to bloat the jitter with more optimization rules.

As the Roslyn source code shows, the conversion is done only for long. All in all, it's entirely possible to add this feature for float or double as well, but it wouldn't be very useful except for producing shorter CIL (which helps when inlining is needed), and when you want a float constant, you usually use an actual floating-point number (i.e. not an integer).
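The kind of rule involved looks roughly like the following. This is a simplified sketch of the idea, not Roslyn's actual code: LongConstantSketch and both methods are hypothetical, and the result is returned as readable text instead of emitted bytes.

```csharp
using System;

public static class LongConstantSketch
{
    // Hypothetical helper: shortest ldc.i4 form for a 32-bit constant.
    static string Int32Constant(int value)
    {
        if (value == -1) return "ldc.i4.m1";
        if (value >= 0 && value <= 8) return "ldc.i4." + value;
        if (value >= sbyte.MinValue && value <= sbyte.MaxValue) return "ldc.i4.s " + value;
        return "ldc.i4 " + value;
    }

    // A long that fits in 32 bits is loaded as an int and widened;
    // anything else falls back to the 9-byte ldc.i8.
    public static string LongConstant(long value)
    {
        if (value >= int.MinValue && value <= int.MaxValue)
            return Int32Constant((int)value) + "; conv.i8";
        return "ldc.i8 " + value;
    }
}
```

A double version would need an extra check that the constant is an exactly representable integer, which is part of why the rule is less attractive there.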

IS4 answered Nov 01 '22 08:11