Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Replacing instructions in a method's MethodBody

(First of all, this is a very lengthy post, but don't worry: I've already implemented all of it, I'm just asking your opinion, or possible alternatives.)

I'm having trouble implementing the following; I'd appreciate some help:

  1. I get a Type as parameter.
  2. I define a subclass using reflection. Notice that I don't intend to modify the original type, but create a new one.
  3. I create a property per field of the original class, like so:

    public class OriginalClass {
        private int x;
    }
    
    
    public class Subclass : OriginalClass {
        private int x;
    
        public int X {
            get { return x; }
            set { x = value; }
        }
    
    }
    
  4. For every method of the superclass, I create an analogous method in the subclass. The method's body must be the same except that I replace the instructions ldfld x with callvirt this.get_X, that is, instead of reading from the field directly I call the get accessor.

I'm having trouble with step 4. I know you're not supposed to manipulate code like this, but I really need to.

Here's what I've tried:

Attempt #1: Use Mono.Cecil. This would allow me to parse the body of the method into human-readable Instructions, and easily replace instructions. However, the original type isn't in a .dll file, so I can't find a way to load it with Mono.Cecil. Writing the type to a .dll, then load it, then modify it and write the new type to disk (which I think is the way you create a type with Mono.Cecil), and then load it seems like a huge overhead.

Attempt #2: Use Mono.Reflection. This would also allow me to parse the body into Instructions, but then I have no support for replacing instructions. I've implemented a very ugly and inefficient solution using Mono.Reflection, but it doesn't yet support methods that contain try-catch statements (although I guess I can implement this) and I'm concerned that there may be other scenarios in which it won't work, since I'm using the ILGenerator in a somewhat unusual way. Also, it's very ugly ;). Here's what I've done:

private void TransformMethod(MethodInfo methodInfo) {

    // Create a method with the same signature.
    ParameterInfo[] paramList = methodInfo.GetParameters();
    Type[] args = new Type[paramList.Length];
    for (int i = 0; i < args.Length; i++) {
        args[i] = paramList[i].ParameterType;
    }
    MethodBuilder methodBuilder = typeBuilder.DefineMethod(
        methodInfo.Name, methodInfo.Attributes, methodInfo.ReturnType, args);
    ILGenerator ilGen = methodBuilder.GetILGenerator();

    // Declare the same local variables as in the original method.
    IList<LocalVariableInfo> locals = methodInfo.GetMethodBody().LocalVariables;
    foreach (LocalVariableInfo local in locals) {
        ilGen.DeclareLocal(local.LocalType);
    }

    // Get readable instructions.
    IList<Instruction> instructions = methodInfo.GetInstructions();

    // I first need to define labels for every instruction in case I
    // later find a jump to that instruction. Once the instruction has
    // been emitted I cannot label it, so I'll need to do it in advance.
    // Since I'm doing a first pass on the method's body anyway, I could
    // instead just create labels where they are truly needed, but for
    // now I'm using this quick fix.
    Dictionary<int, Label> labels = new Dictionary<int, Label>();
    foreach (Instruction instr in instructions) {
        labels[instr.Offset] = ilGen.DefineLabel();
    }

    foreach (Instruction instr in instructions) {

        // Mark this instruction with a label, in case there's a branch
        // instruction that jumps here.
        ilGen.MarkLabel(labels[instr.Offset]);

        // If this is the instruction that I want to replace (ldfld x)...
        if (instr.OpCode == OpCodes.Ldfld) {
            // ...get the get accessor for the accessed field (get_X())
            // (I have the accessors in a dictionary; this isn't relevant),
            MethodInfo safeReadAccessor = dataMembersSafeAccessors[((FieldInfo) instr.Operand).Name][0];
            // ...instead of emitting the original instruction (ldfld x),
            // emit a call to the get accessor,
            ilGen.Emit(OpCodes.Callvirt, safeReadAccessor);

        // Else (it's any other instruction), reemit the instruction, unaltered.
        } else {
            Reemit(instr, ilGen, labels);
        }

    }

}

And here comes the horrible, horrible Reemit method:

private void Reemit(Instruction instr, ILGenerator ilGen, Dictionary<int, Label> labels) {

    // If the instruction doesn't have an operand, emit the opcode and return.
    if (instr.Operand == null) {
        ilGen.Emit(instr.OpCode);
        return;
    }

    // Else (it has an operand)...

    // If it's a branch instruction, retrieve the corresponding label (to
    // which we want to jump), emit the instruction and return.
    if (instr.OpCode.FlowControl == FlowControl.Branch) {
        ilGen.Emit(instr.OpCode, labels[Int32.Parse(instr.Operand.ToString())]);
        return;
    }

    // Otherwise, simply emit the instruction. I need to use the right
    // Emit call, so I need to cast the operand to its type.
    Type operandType = instr.Operand.GetType();
    if (typeof(byte).IsAssignableFrom(operandType))
        ilGen.Emit(instr.OpCode, (byte) instr.Operand);
    else if (typeof(double).IsAssignableFrom(operandType))
        ilGen.Emit(instr.OpCode, (double) instr.Operand);
    else if (typeof(float).IsAssignableFrom(operandType))
        ilGen.Emit(instr.OpCode, (float) instr.Operand);
    else if (typeof(int).IsAssignableFrom(operandType))
        ilGen.Emit(instr.OpCode, (int) instr.Operand);
    ... // you get the idea. This is a pretty long method, all like this.
}

Branch instructions are a special case because instr.Operand is SByte, but Emit expects an operand of type Label. Hence the need for the Dictionary labels.

As you can see, this is pretty horrible. What's more, it doesn't work in all cases, for instance with methods that contain try-catch statements, since I haven't emitted them using methods BeginExceptionBlock, BeginCatchBlock, etc, of ILGenerator. This is getting complicated. I guess I can do it: MethodBody has a list of ExceptionHandlingClause that should contain the necessary information to do this. But I don't like this solution anyway, so I'll save this as a last-resort solution.

Attempt #3: Go bare-back and just copy the byte array returned by MethodBody.GetILAsByteArray(), since I only want to replace a single instruction for another single instruction of the same size that produces the exact same result: it loads the same type of object on the stack, etc. So there won't be any labels shifting and everything should work exactly the same. I've done this, replacing specific bytes of the array and then calling MethodBuilder.CreateMethodBody(byte[], int), but I still get the same error with exceptions, and I still need to declare the local variables or I'll get an error... even when I simply copy the method's body and don't change anything. So this is more efficient but I still have to take care of the exceptions, etc.

Sigh.

Here's the implementation of attempt #3, in case anyone is interested:

private void TransformMethod(MethodInfo methodInfo, Dictionary<string, MethodInfo[]> dataMembersSafeAccessors, ModuleBuilder moduleBuilder) {

    ParameterInfo[] paramList = methodInfo.GetParameters();
    Type[] args = new Type[paramList.Length];
    for (int i = 0; i < args.Length; i++) {
        args[i] = paramList[i].ParameterType;
    }
    MethodBuilder methodBuilder = typeBuilder.DefineMethod(
        methodInfo.Name, methodInfo.Attributes, methodInfo.ReturnType, args);

    ILGenerator ilGen = methodBuilder.GetILGenerator();

    IList<LocalVariableInfo> locals = methodInfo.GetMethodBody().LocalVariables;
    foreach (LocalVariableInfo local in locals) {
        ilGen.DeclareLocal(local.LocalType);
    }

    byte[] rawInstructions = methodInfo.GetMethodBody().GetILAsByteArray();
    IList<Instruction> instructions = methodInfo.GetInstructions();

    int k = 0;
    foreach (Instruction instr in instructions) {

        if (instr.OpCode == OpCodes.Ldfld) {

            MethodInfo safeReadAccessor = dataMembersSafeAccessors[((FieldInfo) instr.Operand).Name][0];

            // Copy the opcode: Callvirt.
            byte[] bytes = toByteArray(OpCodes.Callvirt.Value);
            for (int m = 0; m < OpCodes.Callvirt.Size; m++) {
                rawInstructions[k++] = bytes[put.Length - 1 - m];
            }

            // Copy the operand: the accessor's metadata token.
            bytes = toByteArray(moduleBuilder.GetMethodToken(safeReadAccessor).Token);
            for (int m = instr.Size - OpCodes.Ldfld.Size - 1; m >= 0; m--) {
                rawInstructions[k++] = bytes[m];
            }

        // Skip this instruction (do not replace it).
        } else {
            k += instr.Size;
        }

    }

    methodBuilder.CreateMethodBody(rawInstructions, rawInstructions.Length);

}


private static byte[] toByteArray(int intValue) {
    byte[] intBytes = BitConverter.GetBytes(intValue);
    if (BitConverter.IsLittleEndian)
        Array.Reverse(intBytes);
    return intBytes;
}



private static byte[] toByteArray(short shortValue) {
    byte[] intBytes = BitConverter.GetBytes(shortValue);
    if (BitConverter.IsLittleEndian)
        Array.Reverse(intBytes);
    return intBytes;
}

(I know it isn't pretty. Sorry. I put it quickly together to see if it would work.)

I don't have much hope, but can anyone suggest anything better than this?

Sorry about the extremely lengthy post, and thanks.


UPDATE #1: Aggh... I've just read this in the msdn documentation:

[The CreateMethodBody method] is currently not fully supported. The user cannot supply the location of token fix ups and exception handlers.

I should really read the documentation before trying anything. Some day I'll learn...

This means option #3 can't support try-catch statements, which makes it useless for me. Do I really have to use the horrible #2? :/ Help! :P


UPDATE #2: I've successfully implemented attempt #2 with support for exceptions. It's quite ugly, but it works. I'll post it here when I refine the code a bit. It's not a priority, so it may be a couple of weeks from now. Just letting you know in case someone is interested in this.

Thanks for your suggestions.

like image 333
Alix Avatar asked May 10 '10 14:05

Alix


1 Answers

I am trying to do a very similar thing. I have already tried your #1 approach, and I agree, that creates a huge overhead (I haven't measured it exactly though).

There is a DynamicMethod class which is - according to MSDN - "Defines and represents a dynamic method that can be compiled, executed, and discarded. Discarded methods are available for garbage collection."

Performance wise it sounds good.

With the ILReader library I could convert normal MethodInfo to DynamicMethod. When you look into the ConvertFrom method of the DyanmicMethodHelper class of the ILReader library you can find the code we'd need:

byte[] code = body.GetILAsByteArray();
ILReader reader = new ILReader(method);
ILInfoGetTokenVisitor visitor = new ILInfoGetTokenVisitor(ilInfo, code);
reader.Accept(visitor);

ilInfo.SetCode(code, body.MaxStackSize);

Theoretically this let's us modify the code of an existing method and run it as a dynamic method.

My only problem now is that Mono.Cecil does not allow us to save the bytecode of a method (at least I could not find the way to do it). When you download the Mono.Cecil source code it has a CodeWriter class to accomplish the task, but it is not public.

Other problem I have with this approach is that MethodInfo -> DynamicMethod transformation works only with static methods with ILReader. But this can be worked around.

The performance of the invocation depends on the method I used. I got following results after calling short method 10'000'000 times:

  • Reflection.Invoke - 14 sec
  • DynamicMethod.Invoke - 26 sec
  • DynamicMethod with delegates - 9 sec

Next thing I'm going to try is:

  1. load original method with Cecil
  2. modify the code in Cecil
  3. strip off of the unmodified code from the assembly
  4. save the assembly as MemoryStream instead of File
  5. load the new assembly (from memory) with Reflection
  6. call the method with reflection invoke if its a one-time call
  7. generate DynamicMethod's delegates and store them if I want to call that method regularly
  8. try to find out if I can unload the not necessary assemblies from memory (free up both MemoryStream and run-time assembly representation)

It sounds like a lot of work and it might not work, we'll see :)

I hope it helps, let me know what you think.

like image 87
Andras Avatar answered Oct 06 '22 05:10

Andras