Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What's the purpose of the 'short' notation of IL?

Tags:

c#

.net

opcode

il

Every time I bumb into them in IL: br_S, ldc_i4_S, ldarg_S, etc, etc... So I just have to ask the question:

I mean... If you're JIT'ing a language from IL to native assembler, it shouldn't really matter in terms of performance, right? So what's the purpose of having these 'short-hand' notations? Is it just for having fewer bytes in the IL binaries (e.g. as a compression mechanism) or is there some other reason?

And if it's just as a compression mechanism, why not use a compression algorithm like deflate?

like image 356
atlaste Avatar asked May 30 '15 19:05

atlaste


3 Answers

Sure, it is a micro-optimization. But the Golden Rule of micro-optimizing applies strongly here, they turn macro when you can apply the optimization over and over again. Which is certainly the case here, method bodies are small so the vast majority of branches are short ones, methods have a limited number of arguments and local variables so a single byte is enough to address them, the constants 0 through 9 very often appear in a real program.

Add them all up and you have a macro-optimization on a large assembly, many kilobytes shaved-off. Which does matter at runtime, that's IL that doesn't have to page-faulted into RAM. Warm-start time in a jitted program is always an issue and has been attacked from every possible angle.

In general, the .NET Framework is relentlessly micro-optimized. Largely because Microsoft just can't assume that their code won't be on the critical path in a program.

like image 132
Hans Passant Avatar answered Nov 18 '22 18:11

Hans Passant


not a response: there is already a response, it postulates that without "short" opcodes, the assembly size would bloat. Is it true? Yes... but we have (I have, because I didn't have anything to do this evening) to demonstrate it :-)

How many bytes are "gained" through the use of "short" (in truth byte) values for OpCodes (like Bge_S vs Bge)? And from the use of OpCodes that include a "fixed" index/number? (like Ldarg_0 vs Ldarg or Ldc_I4_0 and Ldc_I4_M1 vs Ldc_I4)?

We could write a program to check it! :-) The result, for mscorlib 4.5.2:

4.5.2 or later

Assembly: C:\Windows\Microsoft.NET\Framework64\v4.0.30319\mscorlib.dll

Size: 5217440

Of which resources: 948657

Skipped: 9496 methods

Parsed: 63720 methods

Gained 420786 bytes from short arguments

Gained 1062014 bytes from "fixed" number

So with "optimized opcodes" mscorlib is 5.2mb ... Without it would be 6.6mb just for the values sizes. 25% more! (and this ignoring that of 5.2mb, 0.9mb are of resources! So the percentage gain in opcode size is even greater)

(I'm using mscorlib because, while it isn't your typical assembly, it is surely quite full of code of every type)

The program: (I'm using Mono.Reflection):

public static void Main(string[] args)
{
    Console.WriteLine(CheckFor45DotVersion(GetReleaseKey()));

    string assemblyName = "mscorlib";
    Assembly assembly = Assembly.Load(assemblyName);

    Console.WriteLine("Assembly: {0}", assembly.Location);

    long fullSize = new FileInfo(assembly.Location).Length;

    Console.WriteLine("Size: {0}", fullSize);

    var allOpCodes = typeof(OpCodes).GetFields(BindingFlags.Static | BindingFlags.Public)
        .Where(x => x.FieldType == typeof(OpCode))
        .Select(x => (OpCode)x.GetValue(null));

    Dictionary<OpCode, int> opcodes = allOpCodes.ToDictionary(x => x, x => 0);

    long resourcesLength = 0;
    int skippedMethods = 0;
    int parsedMethods = 0;

    ParseAssembly(assembly, resource =>
    {
        ManifestResourceInfo info = assembly.GetManifestResourceInfo(resource);

        if (info.ResourceLocation.HasFlag(ResourceLocation.Embedded))
        {
            using (Stream stream = assembly.GetManifestResourceStream(resource))
            {
                resourcesLength += stream.Length;
            }
        }
    }, method =>
    {
        if (method.MethodImplementationFlags.HasFlag(MethodImplAttributes.InternalCall))
        {
            skippedMethods++;
            return;
        }

        if (method.Attributes.HasFlag(MethodAttributes.PinvokeImpl))
        {
            skippedMethods++;
            return;
        }

        if (method.IsAbstract)
        {
            skippedMethods++;
            return;
        }

        parsedMethods++;

        IList<Instruction> instructions = method.GetInstructions();

        foreach (Instruction instruction in instructions)
        {
            int num;
            opcodes.TryGetValue(instruction.OpCode, out num);
            opcodes[instruction.OpCode] = num + 1;
        }
    });

    Console.WriteLine("Of which resources: {0}", resourcesLength);

    Console.WriteLine();

    Console.WriteLine("Skipped: {0} methods", skippedMethods);
    Console.WriteLine("Parsed: {0} methods", parsedMethods);

    int gained = 0;
    int gainedFixedNumber = 0;

    // m1: Ldc_I4_M1
    var shortOpcodes = opcodes.Where(x =>
        x.Key.Name.EndsWith(".s") ||
        x.Key.Name.EndsWith(".m1") ||
            // .0 - .9
        x.Key.Name[x.Key.Name.Length - 2] == '.' && char.IsNumber(x.Key.Name[x.Key.Name.Length - 1]));

    foreach (var @short in shortOpcodes)
    {
        OpCode opCode = @short.Key;
        string name = opCode.Name.Remove(opCode.Name.LastIndexOf('.'));
        OpCode equivalentLong = opcodes.Keys.First(x => x.Name == name);

        int lengthShort = GetLength(opCode.OperandType);
        int lengthLong = GetLength(equivalentLong.OperandType);

        int gained2 = @short.Value * (lengthLong - lengthShort);

        if (opCode.Name.EndsWith(".s"))
        {
            gained += gained2;
        }
        else
        {
            gainedFixedNumber += gained2;
        }
    }

    Console.WriteLine();

    Console.WriteLine("Gained {0} bytes from short arguments", gained);
    Console.WriteLine("Gained {0} bytes from \"fixed\" number", gainedFixedNumber);
}

private static int GetLength(OperandType operandType)
{
    switch (operandType)
    {
        case OperandType.InlineNone:
            return 0;

        case OperandType.ShortInlineVar:
        case OperandType.ShortInlineI:
        case OperandType.ShortInlineBrTarget:
            return 1;

        case OperandType.InlineVar:
            return 2;

        case OperandType.InlineI:
        case OperandType.InlineBrTarget:
            return 4;

    }

    throw new NotSupportedException();
}

private static void ParseAssembly(Assembly assembly, Action<string> resourceAction, Action<MethodInfo> action)
{
    string[] names = assembly.GetManifestResourceNames();

    foreach (string name in names)
    {
        resourceAction(name);
    }

    Module[] modules = assembly.GetModules();

    foreach (Module module in modules)
    {
        ParseModule(module, action);
    }
}

private static void ParseModule(Module module, Action<MethodInfo> action)
{
    MethodInfo[] methods = module.GetMethods(BindingFlags.Instance | BindingFlags.Static | BindingFlags.Public | BindingFlags.NonPublic);

    foreach (MethodInfo method in methods)
    {
        action(method);
    }

    Type[] types = module.GetTypes();

    foreach (Type type in types)
    {
        ParseType(type, action);
    }
}

private static void ParseType(Type type, Action<MethodInfo> action)
{
    if (type.IsInterface)
    {
        return;
    }

    // delegate (in .NET all delegates are MulticastDelegate
    if (type != typeof(MulticastDelegate) && typeof(MulticastDelegate).IsAssignableFrom(type))
    {
        return;
    }

    MethodInfo[] methods = type.GetMethods(BindingFlags.Instance | BindingFlags.Static | BindingFlags.Public | BindingFlags.NonPublic);

    foreach (MethodInfo method in methods)
    {
        action(method);
    }

    Type[] nestedTypes = type.GetNestedTypes(BindingFlags.Public | BindingFlags.NonPublic);

    foreach (Type nestedType in nestedTypes)
    {
        ParseType(nestedType, action);
    }
}

// Adapted from https://msdn.microsoft.com/en-us/library/hh925568.aspx
private static int GetReleaseKey()
{
    using (RegistryKey ndpKey = RegistryKey.OpenBaseKey(RegistryHive.LocalMachine, RegistryView.Registry32).OpenSubKey("SOFTWARE\\Microsoft\\NET Framework Setup\\NDP\\v4\\Full\\"))
    {
        // .NET 4.0
        if (ndpKey == null)
        {
            return 0;
        }

        int releaseKey = (int)ndpKey.GetValue("Release");
        return releaseKey;
    }
}

// Checking the version using >= will enable forward compatibility,  
// however you should always compile your code on newer versions of 
// the framework to ensure your app works the same. 
private static string CheckFor45DotVersion(int releaseKey)
{
    if (releaseKey >= 393273)
    {
        return "4.6 RC or later";
    }
    if ((releaseKey >= 379893))
    {
        return "4.5.2 or later";
    }
    if ((releaseKey >= 378675))
    {
        return "4.5.1 or later";
    }
    if ((releaseKey >= 378389))
    {
        return "4.5 or later";
    }
    // This line should never execute. A non-null release key should mean 
    // that 4.5 or later is installed. 
    return "No 4.5 or later version detected";
}

Inspired by the code of xanatos, the same code using the raw IL bytes might give us an even better estimate of how much we can gain. I've divided the gains / losses into categories to get a good picture.

class Program
{
    internal class Calculator
    {
        static Calculator()
        {
            Register(OpCodes.Beq_S, 1, 3); // category 1: short-hand notations
            Register(OpCodes.Bge_S, 1, 3);
            Register(OpCodes.Bge_Un_S, 1, 3);
            Register(OpCodes.Bgt_S, 1, 3);
            Register(OpCodes.Bgt_Un_S, 1, 3);
            Register(OpCodes.Ble_S, 1, 3);
            Register(OpCodes.Ble_Un_S, 1, 3);
            Register(OpCodes.Blt_S, 1, 3);
            Register(OpCodes.Blt_Un_S, 1, 3);
            Register(OpCodes.Bne_Un_S, 1, 3);
            Register(OpCodes.Br_S, 1, 3);
            Register(OpCodes.Brfalse_S, 1, 3);
            Register(OpCodes.Brtrue_S, 1, 3);
            Register(OpCodes.Conv_I, 2, 4); // category 2: types can be generalized
            Register(OpCodes.Conv_I1, 2, 4);
            Register(OpCodes.Conv_I2, 2, 4);
            Register(OpCodes.Conv_I4, 2, 4);
            Register(OpCodes.Conv_I8, 2, 4);
            Register(OpCodes.Conv_Ovf_I, 2, 4);
            Register(OpCodes.Conv_Ovf_I_Un, 2, 4);
            Register(OpCodes.Conv_Ovf_I1, 2, 4);
            Register(OpCodes.Conv_Ovf_I1_Un, 2, 4);
            Register(OpCodes.Conv_Ovf_I2, 2, 4);
            Register(OpCodes.Conv_Ovf_I2_Un, 2, 4);
            Register(OpCodes.Conv_Ovf_I4, 2, 4);
            Register(OpCodes.Conv_Ovf_I4_Un, 2, 4);
            Register(OpCodes.Conv_Ovf_I8, 2, 4);
            Register(OpCodes.Conv_Ovf_I8_Un, 2, 4);
            Register(OpCodes.Conv_Ovf_U, 2, 4);
            Register(OpCodes.Conv_Ovf_U_Un, 2, 4);
            Register(OpCodes.Conv_Ovf_U1, 2, 4);
            Register(OpCodes.Conv_Ovf_U1_Un, 2, 4);
            Register(OpCodes.Conv_Ovf_U2, 2, 4);
            Register(OpCodes.Conv_Ovf_U2_Un, 2, 4);
            Register(OpCodes.Conv_Ovf_U4, 2, 4);
            Register(OpCodes.Conv_Ovf_U4_Un, 2, 4);
            Register(OpCodes.Conv_Ovf_U8, 2, 4);
            Register(OpCodes.Conv_Ovf_U8_Un, 2, 4);
            Register(OpCodes.Conv_R_Un, 2, 4);
            Register(OpCodes.Conv_R4, 2, 4);
            Register(OpCodes.Conv_R8, 2, 4);
            Register(OpCodes.Conv_U, 2, 4);
            Register(OpCodes.Conv_U1, 2, 4);
            Register(OpCodes.Conv_U2, 2, 4);
            Register(OpCodes.Conv_U4, 2, 4);
            Register(OpCodes.Conv_U8, 2, 4);
            Register(OpCodes.Ldarg_0, 3, 2);
            Register(OpCodes.Ldarg_1, 3, 2);
            Register(OpCodes.Ldarg_2, 3, 2);
            Register(OpCodes.Ldarg_3, 3, 2);
            Register(OpCodes.Ldarg_S, 1, 2);
            Register(OpCodes.Ldarga, 3, 2);
            Register(OpCodes.Ldarga_S, 1, 2);
            Register(OpCodes.Ldc_I4_0, 3, 2); // category 3: small value loads
            Register(OpCodes.Ldc_I4_1, 3, 2);
            Register(OpCodes.Ldc_I4_2, 3, 2);
            Register(OpCodes.Ldc_I4_3, 3, 2);
            Register(OpCodes.Ldc_I4_4, 3, 2);
            Register(OpCodes.Ldc_I4_5, 3, 2);
            Register(OpCodes.Ldc_I4_6, 3, 2);
            Register(OpCodes.Ldc_I4_7, 3, 2);
            Register(OpCodes.Ldc_I4_8, 3, 2);
            Register(OpCodes.Ldc_I4_M1, 3, 2);
            Register(OpCodes.Ldc_I4_S, 1, 3);
            Register(OpCodes.Ldelem_I, 2, 4);
            Register(OpCodes.Ldelem_I1, 2, 4);
            Register(OpCodes.Ldelem_I2, 2, 4);
            Register(OpCodes.Ldelem_I4, 2, 4);
            Register(OpCodes.Ldelem_I8, 2, 4);
            Register(OpCodes.Ldelem_R4, 2, 4);
            Register(OpCodes.Ldelem_R8, 2, 4);
            Register(OpCodes.Ldelem_Ref, 2, 4);
            Register(OpCodes.Ldelem_U1, 2, 4);
            Register(OpCodes.Ldelem_U2, 2, 4);
            Register(OpCodes.Ldelem_U4, 2, 4);
            Register(OpCodes.Ldind_I, 2, 4);
            Register(OpCodes.Ldind_I1, 2, 4);
            Register(OpCodes.Ldind_I2, 2, 4);
            Register(OpCodes.Ldind_I4, 2, 4);
            Register(OpCodes.Ldind_I8, 2, 4);
            Register(OpCodes.Ldind_R4, 2, 4);
            Register(OpCodes.Ldind_R8, 2, 4);
            Register(OpCodes.Ldind_Ref, 2, 4);
            Register(OpCodes.Ldind_U1, 2, 4);
            Register(OpCodes.Ldind_U2, 2, 4);
            Register(OpCodes.Ldind_U4, 2, 4);
            Register(OpCodes.Ldloc_0, 3, 1);
            Register(OpCodes.Ldloc_1, 3, 1);
            Register(OpCodes.Ldloc_2, 3, 1);
            Register(OpCodes.Ldloc_3, 3, 1);
            Register(OpCodes.Ldloc_S, 1, 1);
            Register(OpCodes.Ldloca_S, 1, 1);
            Register(OpCodes.Leave_S, 1, 3);
            Register(OpCodes.Starg_S, 1, 1);
            Register(OpCodes.Stelem_I, 2, 4);
            Register(OpCodes.Stelem_I1, 2, 4);
            Register(OpCodes.Stelem_I2, 2, 4);
            Register(OpCodes.Stelem_I4, 2, 4);
            Register(OpCodes.Stelem_I8, 2, 4);
            Register(OpCodes.Stelem_R4, 2, 4);
            Register(OpCodes.Stelem_R8, 2, 4);
            Register(OpCodes.Stelem_Ref, 2, 4);
            Register(OpCodes.Stind_I, 2, 4);
            Register(OpCodes.Stind_I1, 2, 4);
            Register(OpCodes.Stind_I2, 2, 4);
            Register(OpCodes.Stind_I4, 2, 4);
            Register(OpCodes.Stind_I8, 2, 4);
            Register(OpCodes.Stind_R4, 2, 4);
            Register(OpCodes.Stind_R8, 2, 4);
            Register(OpCodes.Stind_Ref, 2, 4);
            Register(OpCodes.Stloc_0, 3, 1);
            Register(OpCodes.Stloc_1, 3, 1);
            Register(OpCodes.Stloc_2, 3, 1);
            Register(OpCodes.Stloc_3, 3, 1);
            Register(OpCodes.Stloc_S, 1, 1);
        }

        private Calculator() { }

        private static void Register(OpCode opCode, int category, int delta)
        {
            dict[opCode] = new KeyValuePair<int, int>(category, delta);
        }

        private static Dictionary<OpCode, KeyValuePair<int, int>> dict = new Dictionary<OpCode, KeyValuePair<int, int>>();

        public static void Update(int[] data, OpCode opcode)
        {
            KeyValuePair<int, int> kv;
            if (dict.TryGetValue(opcode, out kv))
            {
                data[kv.Key] += kv.Value;
            }
        }
    }

    internal class Decompiler
    {
        public Decompiler() { }

        static Decompiler()
        {
            singleByteOpcodes = new OpCode[0x100];
            multiByteOpcodes = new OpCode[0x100];
            FieldInfo[] infoArray1 = typeof(OpCodes).GetFields();
            for (int num1 = 0; num1 < infoArray1.Length; num1++)
            {
                FieldInfo info1 = infoArray1[num1];
                if (info1.FieldType == typeof(OpCode))
                {
                    OpCode code1 = (OpCode)info1.GetValue(null);
                    ushort num2 = (ushort)code1.Value;
                    if (num2 < 0x100)
                    {
                        singleByteOpcodes[(int)num2] = code1;
                    }
                    else
                    {
                        if ((num2 & 0xff00) != 0xfe00)
                        {
                            throw new Exception("Invalid opcode: " + num2.ToString());
                        }
                        multiByteOpcodes[num2 & 0xff] = code1;
                    }
                }
            }
        }

        private static OpCode[] singleByteOpcodes;
        private static OpCode[] multiByteOpcodes;

        public int[] Delta = new int[5];

        public void Decompile(MethodBase mi, byte[] ildata)
        {
            Module module = mi.Module;

            int position = 0;
            while (position < ildata.Length)
            {
                OpCode code = OpCodes.Nop;

                ushort b = ildata[position++];
                if (b != 0xfe)
                {
                    code = singleByteOpcodes[b];
                }
                else
                {
                    b = ildata[position++];
                    code = multiByteOpcodes[b];
                    b |= (ushort)(0xfe00);
                    Delta[4]++;
                }

                switch (code.OperandType)
                {
                    case OperandType.InlineBrTarget:
                        position += 4;
                        break;
                    case OperandType.InlineField:
                        position += 4;
                        break;
                    case OperandType.InlineI:
                        position += 4;
                        break;
                    case OperandType.InlineI8:
                        position += 8;
                        break;
                    case OperandType.InlineMethod:
                        position += 4;
                        break;
                    case OperandType.InlineNone:
                        break;
                    case OperandType.InlineR:
                        position += 8;
                        break;
                    case OperandType.InlineSig:
                        position += 4;
                        break;
                    case OperandType.InlineString:
                        position += 4;
                        break;
                    case OperandType.InlineSwitch:
                        int count = BitConverter.ToInt32(ildata, position);
                        position += count * 4 + 4;
                        break;
                    case OperandType.InlineTok:
                    case OperandType.InlineType:
                        position += 4;
                        break;
                    case OperandType.InlineVar:
                        position += 2;
                        break;
                    case OperandType.ShortInlineBrTarget:
                        position += 1;
                        break;
                    case OperandType.ShortInlineI:
                        position += 1;
                        break;
                    case OperandType.ShortInlineR:
                        position += 4;
                        break;
                    case OperandType.ShortInlineVar:
                        position += 1;
                        break;

                    default:
                        throw new Exception("Unknown instruction operand; cannot continue. Operand type: " + code.OperandType);
                }

                Calculator.Update(Delta, code);
            }
        }
    }

    static void Main(string[] args)
    {
        string assemblyName = "mscorlib";
        Assembly assembly = Assembly.Load(assemblyName);

        int skippedMethods = 0;
        int parsedMethods = 0;
        Decompiler decompiler = new Decompiler();

        long totalbytes = 0;

        Console.WriteLine("Assembly: {0}", assembly.Location);
        foreach (var type in assembly.GetTypes())
        {
            foreach (var method in type.GetMethods(BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Static | BindingFlags.Instance))
            {
                if (method.MethodImplementationFlags.HasFlag(MethodImplAttributes.InternalCall) ||
                    method.Attributes.HasFlag(MethodAttributes.PinvokeImpl) ||
                    method.IsAbstract)
                {
                    skippedMethods++;
                }
                else
                {
                    var body = method.GetMethodBody();
                    byte[] bytes;
                    if (body != null && (bytes = body.GetILAsByteArray()) != null)
                    {
                        decompiler.Decompile(method, bytes);
                        ++parsedMethods;

                        totalbytes += bytes.Length;
                    }
                    else 
                    {
                        skippedMethods++;
                    }
                }
            }
        }

        var delta = decompiler.Delta;

        Console.WriteLine("{0} methods parsed, {1} methods skipped", parsedMethods, skippedMethods);
        Console.WriteLine("- {0} bytes total", totalbytes);
        Console.WriteLine("- {0} bytes gained from generalizing short-hand notations", delta[1]);
        Console.WriteLine("- {0} bytes gained from generalizing type notations", delta[2]);
        Console.WriteLine("- {0} bytes gained from generalizing loads notations", delta[3]);
        Console.WriteLine("- {0} bytes lost from multi-byte opcodes", delta[4]);

        Console.ReadLine();
    }
}

The results:

Assembly: C:\Windows\Microsoft.NET\Framework\v4.0.30319\mscorlib.dll
56117 methods parsed, 10605 methods skipped
- 2489275 bytes total
- 361193 bytes gained from generalizing short-hand notations
- 126724 bytes gained from generalizing type notations
- 618858 bytes gained from generalizing loads notations
- 9447 bytes lost from multi-byte opcodes

What does this say? First, total is based on the actual IL bytes of the methods. Short-hand notation gains are the number of bytes gained from not using the _S opcodes. Type notation generalizations are opcodes that could have been generalized by using a type token instead of the opcode specialization (e.g. conversions like conv_i4). Load gains are bytes that would be gained from load specializations (e.g. ldc_i4_0). And finally because they now have too many opcodes to fit in a byte, IL also needs a few more bytes.

Total gain from all of this means that we would have a DLL of size 3586603 instead of 2489275 bytes - which totals a compression ratio of +/- 30%.

like image 5
xanatos Avatar answered Nov 18 '22 18:11

xanatos


End of the day short codes attempt to make 'smaller' code. Does not really make any semantic difference.

like image 2
leppie Avatar answered Nov 18 '22 16:11

leppie