
Why casting double to double emits conv.r8 IL instruction

Is there any reason for the C# compiler to emit a conv.r8 when casting from double -> double ?

This looks completely unnecessary: casting from int -> int, char -> char, etc. does not emit an equivalent conversion instruction (as you can see in the generated IL for the I2I() method).

class Foo
{
    double D2D(double d) => (double) d;
    int I2I(int i) => (int) i;
}

results in the IL of:

.class private auto ansi '<Module>'
{
} // end of class <Module>

.class private auto ansi beforefieldinit Foo
    extends [System.Private.CoreLib]System.Object
{
    // Methods
    .method private hidebysig 
        instance float64 D2D (
            float64 d
        ) cil managed 
    {
        // Method begins at RVA 0x2050
        // Code size 3 (0x3)
        .maxstack 8

        IL_0000: ldarg.1
        IL_0001: conv.r8
        IL_0002: ret
    } // end of method Foo::D2D

    .method private hidebysig 
        instance int32 I2I (
            int32 i
        ) cil managed 
    {
        // Method begins at RVA 0x2054
        // Code size 2 (0x2)
        .maxstack 8

        IL_0000: ldarg.1
        IL_0001: ret
    } // end of method Foo::I2I

    .method public hidebysig specialname rtspecialname 
        instance void .ctor () cil managed 
    {
        // Method begins at RVA 0x2057
        // Code size 8 (0x8)
        .maxstack 8

        IL_0000: ldarg.0
        IL_0001: call instance void [System.Private.CoreLib]System.Object::.ctor()
        IL_0006: nop
        IL_0007: ret
    } // end of method Foo::.ctor

} // end of class Foo

You can also play with the code above.

Vagaus asked Mar 01 '23 11:03


1 Answer

The short version is that the intermediate representation of double/float in the CLI is intentionally unspecified. As such, the compiler always emits the conversion for an explicit cast from double to double (or float to float), in case it would change the meaning of an expression.

It doesn't change the meaning in this case, but the compiler doesn't know that. (The JIT does, though, and will optimize it away.)


If you want all the nitty-gritty background details...

The ECMA-335 references below specifically come from the version with Microsoft-Specific implementation notes, which can be downloaded from here. (Note that since we're talking about IL I will be speaking from the perspective of the .NET Runtime's virtual machine, not from any particular processor architecture.)

The justification for why Roslyn emits this seemingly unnecessary instruction can be found in CodeGenerator.EmitIdentityConversion:

An explicit identity conversion from double to double or float to float on non-constants must stay as a conversion. An implicit identity conversion can be optimized away. Why? Because (double)d1 + d2 has different semantics than d1 + d2. The former rounds off to 64 bit precision; the latter is permitted to use higher precision math if d1 is enregistered.

(Emphasis and formatting mine.)
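
To make the explicit/implicit distinction concrete, here is a minimal sketch. The class and method names (CastDemo, Sum, SumRounded, Copy) are made up for illustration, and the IL mentioned in the comments is what I would expect based on the Roslyn comment quoted above, not output I am quoting from a disassembler, so check it yourself before relying on it.

```csharp
using System;

class CastDemo
{
    // No cast in the source, so there is nothing for the compiler to preserve.
    // Expected IL (assumption): ldarg.0; ldarg.1; add; ret
    public static double Sum(double a, double b) => a + b;

    // Explicit identity cast on a non-constant expression. Per the quoted
    // comment, this conversion must stay, so a conv.r8 is expected after the add.
    public static double SumRounded(double a, double b) => (double)(a + b);

    // Implicit identity conversion (a double assigned to a double local).
    // The compiler is free to drop this one.
    public static double Copy(double d)
    {
        double copy = d;
        return copy;
    }

    public static void Main()
    {
        // On a runtime that already evaluates at 64-bit precision, both
        // methods return the same value; only the emitted IL differs.
        Console.WriteLine(Sum(0.1, 0.2) == SumRounded(0.1, 0.2));
    }
}
```

The point is that the two methods differ only in what the source asks for: SumRounded explicitly requests rounding to 64-bit precision, while Sum leaves the precision of the intermediate up to the runtime.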

The important thing to note here is the "permitted to use higher precision math". To understand why this is, we need to understand how the runtime represents different types at a low level. The virtual machine used by the .NET Runtime is stack-based: all intermediate values go onto what is called the evaluation stack. (Not to be confused with the processor's call stack, which may or may not be used for things on the evaluation stack at runtime.)

Partition I §12.3.2.1 The Evaluation Stack (pg 88) describes the evaluation stack, and lists what can be represented on the stack:

While the CLI, in general, supports the full set of types described in §12.1, the CLI treats the evaluation stack in a special way. While some JIT compilers might track the types on the stack in more detail, the CLI only requires that values be one of:

  • int64, an 8-byte signed integer
  • int32, a 4-byte signed integer
  • native int, a signed integer of either 4 or 8 bytes, whichever is more convenient for the target architecture
  • F, a floating point value (float32, float64, or other representation supported by the underlying hardware)
  • &, a managed pointer
  • O, an object reference
  • *, a “transient pointer,” which can be used only within the body of a single method, that points to a value known to be in unmanaged memory (see the CIL Instruction Set specification for more details. * types are generated internally within the CLI; they are not created by the user).
  • A user-defined value type

Of note is the only floating point type being the F type, which you'll notice is intentionally vague and does not represent a specific precision. (This is done to provide flexibility for runtime implementations since they have to run on many different processors, which may or may not prefer a specific level of precision for floating point operations.)

If we dig around a little further, this is also mentioned in Partition I §12.1.3 Handling of floating-point data types (pg 79):

Storage locations for floating-point numbers (statics, array elements, and fields of classes) are of fixed size. The supported storage sizes are float32 and float64. Everywhere else (on the evaluation stack, as arguments, as return types, and as local variables) floating-point numbers are represented using an internal floating-point type.

For the final piece of the puzzle, we need to understand the exact definition of conv.r8, which is defined in Partition III §3.27 conv.<to type> - data conversion (pg 68):

conv.r8: Convert to float64, pushing F on stack.

and finally, the specifics of converting F to F are defined in Partition III §1.5 Table 8: Conversion Operations (pg 20): (Paraphrased)

If input (from the evaluation stack) is F and convert-to is "All float types": Change precision³

³Converts from the current precision available on the evaluation stack to the precision specified by the instruction. If the stack has more precision than the output size the conversion is performed using the IEC 60559:1989 “round-to-nearest” mode to compute the low order bit of the result.

So in this context you should read conv.r8 as "Convert from unspecified floating-point format to double" rather than "Convert from double to double". (Although in this case, we can be pretty sure that F on the evaluation stack is already double precision since it's from a double argument.)
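
The "Change precision" rule can be illustrated with a small sketch. C# has no type wider than double, so this models the situation one level down: double stands in for a higher-precision F value and the cast to float stands in for conv.r4. The helper names (AddWide, ConvR4) are hypothetical, and this is only a simulation of the rule, not what any particular runtime does internally.

```csharp
using System;

static class ChangePrecisionDemo
{
    // Hypothetical helper: models an evaluation stack whose F values carry
    // more precision than float32 (simulated here with double).
    public static double AddWide(float a, float b) => (double)a + (double)b;

    // Models conv.r4: rounds the wide intermediate to float32 precision
    // using round-to-nearest.
    public static float ConvR4(double wide) => (float)wide;

    public static void Main()
    {
        float a = 16777216f; // 2^24, the point where float32 can no longer represent every integer
        float b = 1f;

        double wide = AddWide(a, b);  // 16777217, exactly representable in double
        float narrow = ConvR4(wide);  // rounds to 16777216, the nearest float32

        Console.WriteLine(wide - a);   // prints 1: the extra precision survived
        Console.WriteLine(narrow - a); // prints 0: the conversion rounded it away
    }
}
```

The same expression gives a different result depending on whether the intermediate was rounded, which is exactly why the compiler cannot drop an explicit cast on its own.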


So in summary:

  • The .NET Runtime has a float64 type, but only for storage purposes.
  • For evaluation purposes (and passing arguments), a precision-unspecified F type must be used instead.
  • This means that sometimes an "unnecessary" explicit cast to double is actually changing the precision of an expression.
  • The C# compiler doesn't know whether or not it will matter, so it always emits the conversion from F to float64. (However, the JIT does, and in this case will optimize away the cast at runtime.)

Pathogen David answered Mar 11 '23 14:03