
Why does localloc break this CIL method?

I have the following piece of reduced CIL code.
When this CIL method is executed, the CLR throws an InvalidProgramException:

  .method assembly hidebysig specialname rtspecialname 
          instance void  .ctor(class [mscorlib]System.Collections.Generic.IEnumerable`1<class System.Windows.Input.StylusDeviceBase> styluses) cil managed
  {
    .locals init (class [mscorlib]System.Collections.Generic.IEnumerator`1<class System.Windows.Input.StylusDeviceBase> V_0,
         class System.Windows.Input.StylusDeviceBase V_1)

    ldc.i4.8   // These instructions cause CIL to break 
    conv.u     //
    localloc   //
    pop        //

    ldarg.0
    newobj instance void class [mscorlib]System.Collections.Generic.List`1<class System.Windows.Input.StylusDevice>::.ctor()
    call   instance void class [mscorlib]System.Collections.ObjectModel.ReadOnlyCollection`1<class System.Windows.Input.StylusDevice>::.ctor(class [mscorlib]System.Collections.Generic.IList`1<!0>)
    ldarg.1
    callvirt instance class [mscorlib]System.Collections.Generic.IEnumerator`1<!0> class [mscorlib]System.Collections.Generic.IEnumerable`1<class System.Windows.Input.StylusDeviceBase>::GetEnumerator()
    stloc.0

    .try
    {
       leave.s IL_0040
    }
    finally
    {
       endfinally
    }   

    IL_0040: ret
  } // end of method StylusDeviceCollection::.ctor

My question is, why is this CIL code invalid?

Several observations:
- If localloc is removed, the code runs fine. To my knowledge, localloc replaces the size argument on the stack with an address, so the stack should remain balanced.
- If the try and finally blocks are removed, the code runs fine.
- If the first block of instructions containing localloc is moved to after the try-finally block, the code runs fine.
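
For reference, the stack effect assumed in the first observation can be traced instruction by instruction. The following is only an annotated copy of the four instructions from the method above, with comments based on the ECMA-335 descriptions of each opcode. It is a sketch of the expected stack states, and it does not explain why the JIT rejects the method.

```cil
    // The evaluation stack is empty at this point; ECMA-335 requires it to be
    // empty apart from the size item when localloc executes.
    ldc.i4.8   // push the constant 8 as int32        -> stack: int32
    conv.u     // convert it to native unsigned int   -> stack: native unsigned int
    localloc   // pop the size, push the block address -> stack: native int (address)
    pop        // discard the address                  -> stack: empty
```

ECMA-335 also states that localloc cannot occur inside a filter, catch, finally, or fault block. Here it sits before the .try, so that restriction does not appear to be violated either. Because the method declares its locals with .locals init, the allocated bytes are zero-initialized.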

So the problem seems to lie in the combination of localloc and the try-finally block.

Some background:

I got to this point after an InvalidProgramException was thrown for the original method, due to some instrumentation performed at runtime. To figure out what is wrong with the instrumentation, my debugging approach is:

  • Disassembling the faulty DLL with ildasm
  • Applying the instrumentation code to the crashing method
  • Recreating the DLL from the modified IL with ilasm
  • Running the program again, and verifying it crashes
  • Gradually reducing the IL code of the crashing method, down to the minimal scenario that causes the problem (and trying not to introduce bugs along the way...)

Unfortunately, peverify.exe /IL did not indicate any error. I tried to consult the ECMA spec and Serge Lidin's Expert .NET IL book, but couldn't figure out what goes wrong.

Is there something basic I am missing?

Edit:

I slightly updated the IL code in question, to make it more complete (without modifying instructions). The second block of instructions, including ldarg, newobj, etc., is taken as is from the working code - the original method code.

What's weird to me is that removing either localloc or the .try-finally makes the code work, yet neither of them, to my knowledge, should affect the balance of the stack.

Here's the IL code decompiled into C# with ILSpy:

internal unsafe StylusDeviceCollection(IEnumerable<StylusDeviceBase> styluses)
{
    IntPtr arg_04_0 = stackalloc byte[(UIntPtr)8];
    base..ctor(new List<StylusDevice>());
    IEnumerator<StylusDeviceBase> enumerator = styluses.GetEnumerator();
    try
    {
    }
    finally
    {
    }
}

Edit 2:

More observations:
- If the localloc block of IL code is moved to the end of the function, the code runs fine, so that code seems to be OK on its own.
- The issue does not reproduce when pasting similar IL code into a hello world test function.

I'm very puzzled...

I wish there was a way to get more information from the InvalidProgramException. It seems that the CLR doesn't attach the exact failure reason to the exception object. I also thought about debugging with a CoreCLR debug build, but unfortunately the program I'm debugging is not compatible with it...

valiano asked Oct 29 '22 00:10


1 Answer

Sadly, it seems I hit a CLR bug...

Everything works when using the legacy JIT compiler:

set COMPLUS_useLegacyJit=1

I wasn't able to isolate a specific RyuJIT setting that may be causing this. I followed the recommendation in this article:
https://github.com/Microsoft/dotnet/blob/master/Documentation/testing-with-ryujit.md

Thanks to everyone who helped!

Aftermath:

Some time after I came across the legacy JIT workaround, I realized that the issue only manifests when instrumenting localloc (which is a non-verifiable opcode) into a security-critical method called from a security-transparent method. Only in this scenario does RyuJIT throw an InvalidProgramException, while the legacy JIT does not.

In my reproduction, I disassembled and reassembled the DLL in question and modified the function code directly, keeping security attributes intact - specifically the AllowPartiallyTrustedCallers assembly attribute - which explains why the issue wasn't reproduced with an isolated example.

It might be that RyuJIT has some security hardening compared to the legacy JIT which surfaces this issue. Still, the fact that localloc causes the CLR to throw an InvalidProgramException depending on the presence of a try block and its position relative to the localloc does seem like a subtle bug.

Running SecAnnotate.exe (.NET Security Annotator tool) on the failing DLL was helpful in revealing the security issues between function calls.

More on Security-Transparent Code:
https://learn.microsoft.com/en-us/dotnet/framework/misc/security-transparent-code

valiano answered Nov 15 '22 05:11