
C# IL code modification - keep stack intact

This question is about static stack analysis of custom C# IL code and how to design the opcodes to satisfy the compiler.

I have code that modifies existing C# methods by appending my own code to them. To prevent the original method from returning before my code runs, it replaces every RET opcode with a BR endlabel and adds that label at the end of the original code. I then add more code there, followed by a final RET.
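The rewrite described above can be sketched roughly as follows. This is only an illustration: `Instr` and `RetRewriter` are hypothetical names for a minimal instruction model, not the actual patching code, and the sketch assumes branch operands are stored as references to the target instruction rather than as byte offsets.

```csharp
using System.Collections.Generic;
using System.Reflection.Emit;

// Hypothetical minimal instruction model, for illustration only.
public sealed class Instr
{
    public OpCode OpCode;
    public object Operand; // for branches: the target Instr

    public Instr(OpCode opCode, object operand = null)
    {
        OpCode = opCode;
        Operand = operand;
    }
}

public static class RetRewriter
{
    // Turns every ret into a br to a new end marker, appends that marker
    // to the body, and returns it so the caller can add code after it.
    public static Instr RedirectReturns(List<Instr> body)
    {
        var end = new Instr(OpCodes.Nop); // stands in for "endlabel"
        foreach (var ins in body)
        {
            if (ins.OpCode == OpCodes.Ret)
            {
                ins.OpCode = OpCodes.Br;
                ins.Operand = end;
            }
        }
        body.Add(end);
        return end;
    }
}
```

The caller would then append its own instructions after the returned marker and finish with a single RET.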

This all works fine in general but fails on certain methods. Here is a simple example:

public static string SomeMethod(int val)
{
    switch (val)
    {
        case 0:
            return "string1".convert();
        case 1:
            return "string2".convert();
        case 2:
            return "string3".convert();
        // ...
    }
    return "";
}

which is represented by this IL code:

.method public hidebysig static string SomeMethod(int32 val) cil managed
{
    .maxstack 1
    .locals init ([0] int32 num)
    L_0000: ldarg.0 
    L_0001: stloc.0 
    L_0002: ldloc.0 
    L_0003: switch (L_002e, L_0039, L_0044, ...)
    L_002c: br.s L_0091
    L_002e: ldstr "string1"
    L_0033: call string Foo::convert(string)
    L_0038: ret 
    L_0039: ldstr "string2"
    L_003e: call string Foo::convert(string)
    L_0043: ret 
    L_0044: ldstr "string3"
    L_0049: call string Foo::convert(string)
    L_004e: ret 
    ... 
    L_0091: ldstr ""
    L_0096: ret 
}

After my program modified it, the code looks like this:

.method public hidebysig static string SomeMethod(int32 val) cil managed
{
    .maxstack 1
    .locals init ([0] int32 num)
    L_0000: ldarg.0 
    L_0001: stloc.0 
    L_0002: ldloc.0 
    L_0003: switch (L_002e, L_0039, L_0044, ...)
    L_002c: br.s L_0091
    L_002e: ldstr "string1"
    L_0033: call string Foo::convert(string)
    L_0038: br L_009b // was ret 
    L_0039: ldstr "string2"
    L_003e: call string Foo::convert(string)
    L_0043: br L_009b // was ret 
    L_0044: ldstr "string3"
    L_0049: call string Foo::convert(string)
    L_004e: br L_009b // was ret 
    ... 
    L_0091: ldstr ""
    L_0096: br L_009b // was ret
    L_009b: my code here
    ...
    L_0200: ret
}

and I get an error when the modified method is compiled:

Could not execute post-long-event action. Exception: System.TypeInitializationException: An exception was thrown by the type initializer for FooBar ---> System.InvalidProgramException: Invalid IL code in (wrapper dynamic-method) Foo:SomeMethod (int): IL_0000: ldnull

Is there any simple way to replace RETs in a generic way and keep the static analyzer happy?

Andreas Pardeike asked Mar 12 '17 22:03


1 Answer

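The problem turned out to be the short jump instructions. Replacing a RET (1 byte) with a BR (5 bytes: a 1-byte opcode plus a 4-byte offset) makes the method body longer, so the target of a short jump, whose offset must fit in a signed byte, can end up too far away.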
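To make the size effect concrete, here is a small arithmetic sketch. The class and method names are made up for illustration, and it only models a forward short jump that skips over some number of rewritten RETs.

```csharp
public static class ShortBranchMath
{
    public const int RetSize = 1; // ret: one-byte opcode, no operand
    public const int BrSize = 5;  // br: one-byte opcode plus 4-byte offset

    // A short branch stores its offset in a signed byte (-128..127).
    public static bool FitsShortForm(int offset)
        => offset >= sbyte.MinValue && offset <= sbyte.MaxValue;

    // Offset of a forward short jump after every ret it jumps over
    // has been replaced by a br (each replacement adds 4 bytes).
    public static int OffsetAfterRewrite(int originalOffset, int retsJumpedOver)
        => originalOffset + retsJumpedOver * (BrSize - RetSize);
}
```

For example, a forward short jump with an offset of 120 that skips two RETs ends up needing an offset of 128 after the rewrite, which no longer fits in a signed byte.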

I solved it by replacing every jump opcode ending in "_S" with its corresponding long jump version. For more details, have a look at this commit to my project: Fixed illegal short jumps
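A minimal sketch of such a replacement, assuming the opcodes come from `System.Reflection.Emit.OpCodes`. The `BranchWidening` class and its `Widen` method are illustrative names and not the code from the commit; the table deliberately covers only branch opcodes, since non-branch "_S" opcodes such as `ldarg.s` are not affected by jump distance.

```csharp
using System.Collections.Generic;
using System.Reflection.Emit;

public static class BranchWidening
{
    // Short-form branch opcodes and their long-form equivalents.
    static readonly Dictionary<OpCode, OpCode> ShortToLong = new Dictionary<OpCode, OpCode>
    {
        { OpCodes.Br_S, OpCodes.Br },
        { OpCodes.Brfalse_S, OpCodes.Brfalse },
        { OpCodes.Brtrue_S, OpCodes.Brtrue },
        { OpCodes.Beq_S, OpCodes.Beq },
        { OpCodes.Bge_S, OpCodes.Bge },
        { OpCodes.Bgt_S, OpCodes.Bgt },
        { OpCodes.Ble_S, OpCodes.Ble },
        { OpCodes.Blt_S, OpCodes.Blt },
        { OpCodes.Bne_Un_S, OpCodes.Bne_Un },
        { OpCodes.Bge_Un_S, OpCodes.Bge_Un },
        { OpCodes.Bgt_Un_S, OpCodes.Bgt_Un },
        { OpCodes.Ble_Un_S, OpCodes.Ble_Un },
        { OpCodes.Blt_Un_S, OpCodes.Blt_Un },
        { OpCodes.Leave_S, OpCodes.Leave },
    };

    // Returns the long form of a short branch opcode; any other opcode is returned unchanged.
    public static OpCode Widen(OpCode op)
        => ShortToLong.TryGetValue(op, out var longForm) ? longForm : op;
}
```

When the body is written back out, the operand of each widened branch has to be emitted as a 4-byte offset instead of a 1-byte one.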

Andreas Pardeike answered Sep 21 '22 18:09