Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Method is not inlined by the JIT compiler even though all criteria seems to be met

Background

While writing a class for parsing certain text, I needed the ability to get the line number of a specific character position (in other words, count all linebreaks that occur before that character).

Trying to find the most efficient code possible to achieve this I set up a couple of benchmarks, which revealed that Regex was the slowest method and that manually iterating the string was the fastest.

Following is my current approach (10k iterations: 278 ms):

private string text;

/// <summary>
/// Returns whether the specified character index is the end of a line.
/// </summary>
/// <param name="index">The index to check.</param>
/// <returns></returns>
private bool IsEndOfLine(int index)
{
    //Matches "\r" and "\n" (but not "\n" if it's preceded by "\r").
    char c = text[index];
    return c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'));
}

/// <summary>
/// Returns the number of the line at the specified character index.
/// </summary>
/// <param name="index">The index of the character which's line number to get.</param>
/// <returns></returns>
public int GetLineNumber(int index)
{
    if(index < 0 || index > text.Length) { throw new ArgumentOutOfRangeException("index"); }

    int lineNumber = 1;
    int end = index;

    index = 0;
    while(index < end) {
        if(IsEndOfLine(index)) lineNumber++;
        index++;
    }

    return lineNumber;
}

However, while doing these benchmarks I remembered that method calls can sometimes be a bit expensive, so I decided to try moving the conditions from IsEndOfLine() directly into the if-statement inside GetLineNumber() as well.

Like I expected, this executes more than two times faster (10k iterations: 112 ms):

while(index < end) {
    char c = text[index];
    if(c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'))) lineNumber++;
    index++;
}

Problem

From what I've read the JIT compiler doesn't (or at least didn't) optimize IL code that is more than 32 bytes in size[1] unless [MethodImplAttribute(MethodImplOptions.AggressiveInlining)] is specified[2]. But despite applying this attribute to IsEndOfLine(), no inlining seems to occur.

Most of the talk I've been able to find about this are from older posts/articles. In the newest one ([2] from 2012) the author did apparently successfully inline a 34-byte function using MethodImplOptions.AggressiveInlining, implying that the flag allows larger IL code to be inlined if all other criteria are met.

Measuring my method's size using the following code revealed that it is 54 bytes long:

Console.WriteLine(this.GetType().GetMethod("IsEndOfLine").GetMethodBody().GetILAsByteArray().Length);

Using the Dissasembly window in VS 2019 shows the following Assembly code for IsEndOfLine() (with C# source code turned on in the Viewing Options):

(Configuration: Release (x86), disabled Just My Code and Suppress JIT optimization on module load)

--- [PATH REMOVED]\Performance Test - Find text line number\TextParser.cs 
    28:             char c = text[index];
001E19BA  in          al,dx  
001E19BB  mov         eax,dword ptr [ecx+4]  
001E19BE  cmp         edx,dword ptr [eax+4]  
001E19C1  jae         001E19FF  
001E19C3  movzx       eax,word ptr [eax+edx*2+8]  
    29:             return c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'));
001E19C8  cmp         eax,0Dh  
001E19CB  je          001E19F8  
001E19CD  cmp         eax,0Ah  
001E19D0  jne         001E19F4  
001E19D2  test        edx,edx  
001E19D4  je          001E19ED  
001E19D6  dec         edx  
001E19D7  mov         eax,dword ptr [ecx+4]  
001E19DA  cmp         edx,dword ptr [eax+4]  
001E19DD  jae         001E19FF  
001E19DF  cmp         word ptr [eax+edx*2+8],0Dh  
001E19E5  setne       al  
001E19E8  movzx       eax,al  
001E19EB  pop         ebp  
001E19EC  ret  
001E19ED  mov         eax,1  
001E19F2  pop         ebp  
001E19F3  ret  
001E19F4  xor         eax,eax  
001E19F6  pop         ebp  
001E19F7  ret  
001E19F8  mov         eax,1  
001E19FD  pop         ebp  
001E19FE  ret  
001E19FF  call        70C2E2B0  
001E1A04  int         3  

...and the following code for the loop in GetLineNumber():

    63:             index = 0;
001E1950  xor         esi,esi  
    64:             while(index < end) {
001E1952  test        ebx,ebx  
001E1954  jle         001E196C  
001E1956  mov         ecx,edi  
001E1958  mov         edx,esi  
001E195A  call        dword ptr ds:[144E10h]  
001E1960  test        eax,eax  
001E1962  je          001E1967  
    65:                 if(IsEndOfLine(index)) lineNumber++;
001E1964  inc         dword ptr [ebp-10h]  
    66:                 index++;
001E1967  inc         esi  
    64:             while(index < end) {
001E1968  cmp         esi,ebx  
001E196A  jl          001E1956  
    67:             }
    68: 
    69:             return lineNumber;
001E196C  mov         eax,dword ptr [ebp-10h]  
001E196F  pop         ecx  
001E1970  pop         ebx  
001E1971  pop         esi  
001E1972  pop         edi  
001E1973  pop         ebp  
001E1974  ret  

I'm not very good at reading Assembly code, but it seems to me that no inlining has occurred.

Question

Why doesn't the JIT compiler inline my IsEndOfLine() method even when MethodImplOptions.AggressiveInlining is specified? I know this flag is only a hint to the compiler, but based on [2] applying it should make it possible to inline IL larger than 32 bytes. Apart from that, to me, my code seems to satisfy all other conditions.

Is there some other kind of limitation that I am missing?

Benchmarks

Results:

Text length: 11645

Line: 201
Standard loop: 00:00:00.2779946 (10000 à 00:00:00.0000277)

Line: 201
Standard loop (inline): 00:00:00.1122908 (10000 à 00:00:00.0000112)

<benchmark code moved to answer for brevity>

Footnotes

1To Inline or not to Inline: That is the question

2Aggressive Inlining in the CLR 4.5 JIT


-- EDIT --

For some reason, after restarting VS, enabling and re-disabling the settings mentioned before as well as re-applying MethodImplOptions.AggressiveInlining, the method does now appear to be inlined. However, it has added a couple of instructions that aren't there when you inline the if-conditions manually.

JIT-optimized version:

    66:             while(index < end) {
001E194B  test        ebx,ebx  
001E194D  jle         001E1998  
001E194F  mov         esi,dword ptr [ecx+4]  
    67:                 if(IsEndOfLine(index)) lineNumber++;
001E1952  cmp         edx,esi  
001E1954  jae         001E19CA  
001E1956  movzx       eax,word ptr [ecx+edx*2+8]  

001E195B  cmp         eax,0Dh  
001E195E  je          001E1989  
001E1960  cmp         eax,0Ah  
001E1963  jne         001E1985  
001E1965  test        edx,edx  
001E1967  je          001E197E  
001E1969  mov         eax,edx  
001E196B  dec         eax  
001E196C  cmp         eax,esi  
001E196E  jae         001E19CA  
001E1970  cmp         word ptr [ecx+eax*2+8],0Dh  
001E1976  setne       al  
001E1979  movzx       eax,al  
001E197C  jmp         001E198E  
001E197E  mov         eax,1  
001E1983  jmp         001E198E  
001E1985  xor         eax,eax  
001E1987  jmp         001E198E  
001E1989  mov         eax,1  
001E198E  test        eax,eax  
001E1990  je          001E1993  
001E1992  inc         edi  
    68:                 index++;

My optimized version:

    87:             while(index < end) {
001E1E9B  test        ebx,ebx  
001E1E9D  jle         001E1ECE  
001E1E9F  mov         esi,dword ptr [ecx+4]  
    88:                 char c = text[index];
001E1EA2  cmp         edx,esi  
001E1EA4  jae         001E1F00  
001E1EA6  movzx       eax,word ptr [ecx+edx*2+8]  
    89:                 if(c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'))) lineNumber++;
001E1EAB  cmp         eax,0Dh  
001E1EAE  je          001E1EC8  
001E1EB0  cmp         eax,0Ah  
001E1EB3  jne         001E1EC9  
001E1EB5  test        edx,edx  
001E1EB7  je          001E1EC8  
001E1EB9  mov         eax,edx  
001E1EBB  dec         eax  
001E1EBC  cmp         eax,esi  
001E1EBE  jae         001E1F00  
001E1EC0  cmp         word ptr [ecx+eax*2+8],0Dh  
001E1EC6  je          001E1EC9  
001E1EC8  inc         edi  
    90:                 index++;

New instructions:

001E1976  setne       al  
001E1979  movzx       eax,al  
001E197C  jmp         001E198E  
001E197E  mov         eax,1  
001E1983  jmp         001E198E  
001E1985  xor         eax,eax  
001E1987  jmp         001E198E  
001E1989  mov         eax,1  
001E198E  test        eax,eax  

I still see no improvement in performance/execution speed, however... Supposedly this is due to the extra instructions that the JIT added, and I'm guessing this is as good as it gets without inlining the conditions myself?

like image 513
Visual Vincent Avatar asked Sep 19 '19 13:09

Visual Vincent


1 Answers

For some reason, after restarting VS, enabling and re-disabling the settings mentioned before as well as re-applying MethodImplOptions.AggressiveInlining, the method does now appear to be inlined (weird that it didn't before...). However it has added a couple of instructions that aren't there when you inline the if-conditions manually.

The efficiency/execution speed appears to stay the same, however. Hans Passant suggested that I replace the short-circuiting operators (where possible) with the regular | and &, which does reduce the speed gap from 2x to 1.5x. I'm guessing this is as good as it gets when it comes to JIT optimization.

return c == '\r' | (c == '\n' & (index == 0 || text[index - 1] != '\r'));

An interesting discovery I made (or, at least, interesting to me as I don't really understand how these Assembly-level optimizations work under the hood) was that when the same operator swap is done to the manually inlined conditions (inside GetLineNumberInline()), execution speed gets much worse.

The purpose of this adventure was to get as efficient code as possible without having to duplicate it everywhere I use it (since in the original code IsEndOfLine() is used multiple times throughout the project). In the end I think I'll stick to duplicating the IsEndOfLine() code only inside GetLineNumber(), as that proved to be the fastest in terms of execution speed.


I would like to thank the ones who took their time trying to help me (some comments have been removed), as while I didn't achieve the result that I thought JIT-optimized inlining would get me, I've still learned a lot that I didn't know before. Now I've at least gotten a small glimpse of what JIT optimization does under the hood and how it is much more complicated than what I could originally have imagined.


Complete benchmark results, for future reference (ordered by execution time):

Text length: 15882
Character position: 11912

Standard loop (inline):                    00:00:00.1429526 (10000 à 0.0142 ms)
Standard loop (inline unsafe):             00:00:00.1642801 (10000 à 0.0164 ms)
Standard loop (inline + no short-circuit): 00:00:00.3250843 (10000 à 0.0325 ms)
Standard loop (AggressiveInlining):        00:00:00.3318966 (10000 à 0.0331 ms)
Standard loop (unsafe):                    00:00:00.3605394 (10000 à 0.0360 ms)
Standard loop:                             00:00:00.3859629 (10000 à 0.0385 ms)
Regex (Substring):                         00:00:01.8794045 (10000 à 0.1879 ms)
Regex (MatchCollection loop):              00:00:02.4916785 (10000 à 0.2491 ms)

Resulting line: 284

/* "unsafe" is using pointers to access the string's characters */

class Program
{
    const int RUNS = 10000;

    static void Main(string[] args)
    {
        string text = "";
        Random r = new Random();

        //Some words to fill the string with.
        string[] words = new string[] { "Hello", "world", "Inventory.MaxAmount 32", "+QUICKTORETALIATE", "TNT1 AABBCC 6 A_JumpIf(ACS_ExecuteWithResult(460, 0, 0, 0) == 0, \"See0\")" };

        //Various line endings.
        string[] endings = new string[] { "\r\n", "\r", "\n" };



        /*
            Generate text
        */
        int lineCount = r.Next(256, 513);

        for(int l = 0; l < lineCount; l++) {
            int wordCount = r.Next(1, 4);
            text += new string(' ', r.Next(4, 9));

            for(int w = 0; w < wordCount; w++) {
                text += words[wordCount] + (w < wordCount - 1 ? " " : "");
            }

            text += endings[r.Next(0, endings.Length)];
        }

        Console.WriteLine("Text length: " + text.Length);
        Console.WriteLine();



        /*
            Initialize class and stopwatch
        */
        TextParser parser = new TextParser(text);
        Stopwatch sw = new Stopwatch();

        List<int> numbers = new List<int>(); //Using a list to prevent the compiler from optimizing-away the "GetLineNumber" call.



        /*
            Test 1 - Standard loop
        */
        sw.Restart();
        for(int x = 0; x < RUNS; x++) {
            numbers.Add(parser.GetLineNumber((int)(text.Length * 0.75) + r.Next(-4, 4)));
        }
        sw.Stop();

        Console.WriteLine("Line: " + numbers[0]);
        Console.WriteLine("Standard loop: ".PadRight(41) + sw.Elapsed.ToString() + " (" + numbers.Count + " à " + new TimeSpan(sw.Elapsed.Ticks / numbers.Count).TotalMilliseconds.ToString() + " ms)");
        Console.WriteLine();

        numbers = new List<int>();



        /*
            Test 2 - Standard loop (with AggressiveInlining)
        */
        sw.Restart();
        for(int x = 0; x < RUNS; x++) {
            numbers.Add(parser.GetLineNumber2((int)(text.Length * 0.75) + r.Next(-4, 4)));
        }
        sw.Stop();

        Console.WriteLine("Line: " + numbers[0]);
        Console.WriteLine("Standard loop (AggressiveInlining): ".PadRight(41) + sw.Elapsed.ToString() + " (" + numbers.Count + " à " + new TimeSpan(sw.Elapsed.Ticks / numbers.Count).TotalMilliseconds.ToString() + " ms)");
        Console.WriteLine();

        numbers = new List<int>();



        /*
            Test 3 - Standard loop (with inline check)
        */
        sw.Restart();
        for(int x = 0; x < RUNS; x++) {
            numbers.Add(parser.GetLineNumberInline((int)(text.Length * 0.75) + r.Next(-4, 4)));
        }
        sw.Stop();

        Console.WriteLine("Line: " + numbers[0]);
        Console.WriteLine("Standard loop (inline): ".PadRight(41) + sw.Elapsed.ToString() + " (" + numbers.Count + " à " + new TimeSpan(sw.Elapsed.Ticks / numbers.Count).TotalMilliseconds.ToString() + " ms)");
        Console.WriteLine();

        numbers = new List<int>();



        /*
            Test 4 - Standard loop (with inline and no short-circuiting)
        */
        sw.Restart();
        for(int x = 0; x < RUNS; x++) {
            numbers.Add(parser.GetLineNumberInline2((int)(text.Length * 0.75) + r.Next(-4, 4)));
        }
        sw.Stop();

        Console.WriteLine("Line: " + numbers[0]);
        Console.WriteLine("Standard loop (inline + no short-circuit): ".PadRight(41) + sw.Elapsed.ToString() + " (" + numbers.Count + " à " + new TimeSpan(sw.Elapsed.Ticks / numbers.Count).TotalMilliseconds.ToString() + " ms)");
        Console.WriteLine();

        numbers = new List<int>();



        /*
            Test 5 - Standard loop (with unsafe check)
        */
        sw.Restart();
        for(int x = 0; x < RUNS; x++) {
            numbers.Add(parser.GetLineNumberUnsafe((int)(text.Length * 0.75) + r.Next(-4, 4)));
        }
        sw.Stop();

        Console.WriteLine("Line: " + numbers[0]);
        Console.WriteLine("Standard loop (unsafe): ".PadRight(41) + sw.Elapsed.ToString() + " (" + numbers.Count + " à " + new TimeSpan(sw.Elapsed.Ticks / numbers.Count).TotalMilliseconds.ToString() + " ms)");
        Console.WriteLine();

        numbers = new List<int>();



        /*
            Test 6 - Standard loop (with inline + unsafe check)
        */
        sw.Restart();
        for(int x = 0; x < RUNS; x++) {
            numbers.Add(parser.GetLineNumberUnsafeInline((int)(text.Length * 0.75) + r.Next(-4, 4)));
        }
        sw.Stop();

        Console.WriteLine("Line: " + numbers[0]);
        Console.WriteLine("Standard loop (inline unsafe): ".PadRight(41) + sw.Elapsed.ToString() + " (" + numbers.Count + " à " + new TimeSpan(sw.Elapsed.Ticks / numbers.Count).TotalMilliseconds.ToString() + " ms)");
        Console.WriteLine();

        numbers = new List<int>();



        /*
            Test 7 - Regex (with Substring)
        */
        sw.Restart();
        for(int x = 0; x < RUNS; x++) {
            numbers.Add(parser.GetLineNumberRegex((int)(text.Length * 0.75) + r.Next(-4, 4)));
        }
        sw.Stop();

        Console.WriteLine("Line: " + numbers[0]);
        Console.WriteLine("Regex (Substring): ".PadRight(41) + sw.Elapsed.ToString() + " (" + numbers.Count + " à " + new TimeSpan(sw.Elapsed.Ticks / numbers.Count).TotalMilliseconds.ToString() + " ms)");
        Console.WriteLine();

        numbers = new List<int>();



        /*
            Test 8 - Regex (with MatchCollection loop)
        */
        sw.Restart();
        for(int x = 0; x < RUNS; x++) {
            numbers.Add(parser.GetLineNumberRegex2((int)(text.Length * 0.75) + r.Next(-4, 4)));
        }
        sw.Stop();

        Console.WriteLine("Line: " + numbers[0]);
        Console.WriteLine("Regex (MatchCollection loop): ".PadRight(41) + sw.Elapsed.ToString() + " (" + numbers.Count + " à " + new TimeSpan(sw.Elapsed.Ticks / numbers.Count).TotalMilliseconds.ToString() + " ms)");
        Console.WriteLine();

        numbers = new List<int>();



        /*
            Tests completed
        */
        Console.Write("All tests completed. Press ENTER to close...");
        while(Console.ReadKey(true).Key != ConsoleKey.Enter);
    }
}
public class TextParser
{
    private static readonly Regex LineRegex = new Regex("\r\n|\r|\n", RegexOptions.Compiled);

    private string text;

    public TextParser(string text)
    {
        this.text = text;
    }

    /// <summary>
    /// Returns whether the specified character index is the end of a line.
    /// </summary>
    /// <param name="index">The index to check.</param>
    /// <returns></returns>
    private bool IsEndOfLine(int index)
    {
        char c = text[index];
        return c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'));
    }

    /// <summary>
    /// Returns whether the specified character index is the end of a line.
    /// </summary>
    /// <param name="index">The index to check.</param>
    /// <returns></returns>
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    private bool IsEndOfLineAggressiveInlining(int index)
    {
        char c = text[index];
        return c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'));
    }

    /// <summary>
    /// Returns whether the specified character index is the end of a line.
    /// </summary>
    /// <param name="index">The index to check.</param>
    /// <returns></returns>
    private bool IsEndOfLineUnsafe(int index)
    {
        unsafe
        {
            fixed(char* ptr = text) {
                char c = ptr[index];
                return c == '\r' || (c == '\n' && (index == 0 || ptr[index - 1] != '\r'));
            }
        }
    }



    /// <summary>
    /// Returns the number of the line at the specified character index.
    /// </summary>
    /// <param name="index">The index of the character which's line number to get.</param>
    /// <returns></returns>
    public int GetLineNumber(int index)
    {
        if(index < 0 || index > text.Length) { throw new ArgumentOutOfRangeException("index"); }

        int lineNumber = 1;
        int end = index;

        index = 0;
        while(index < end) {
            if(IsEndOfLine(index)) lineNumber++;
            index++;
        }

        return lineNumber;
    }



    /// <summary>
    /// Returns the number of the line at the specified character index.
    /// </summary>
    /// <param name="index">The index of the character which's line number to get.</param>
    /// <returns></returns>
    public int GetLineNumber2(int index)
    {
        if(index < 0 || index > text.Length) { throw new ArgumentOutOfRangeException("index"); }

        int lineNumber = 1;
        int end = index;

        index = 0;
        while(index < end) {
            if(IsEndOfLineAggressiveInlining(index)) lineNumber++;
            index++;
        }

        return lineNumber;
    }

    /// <summary>
    /// Returns the number of the line at the specified character index.
    /// </summary>
    /// <param name="index">The index of the character which's line number to get.</param>
    /// <returns></returns>
    public int GetLineNumberInline(int index)
    {
        if(index < 0 || index > text.Length) { throw new ArgumentOutOfRangeException("index"); }

        int lineNumber = 1;
        int end = index;

        index = 0;
        while(index < end) {
            char c = text[index];
            if(c == '\r' || (c == '\n' && (index == 0 || text[index - 1] != '\r'))) lineNumber++;
            index++;
        }

        return lineNumber;
    }

    /// <summary>
    /// Returns the number of the line at the specified character index.
    /// </summary>
    /// <param name="index">The index of the character which's line number to get.</param>
    /// <returns></returns>
    public int GetLineNumberInline2(int index)
    {
        if(index < 0 || index > text.Length) { throw new ArgumentOutOfRangeException("index"); }

        int lineNumber = 1;
        int end = index;

        index = 0;
        while(index < end) {
            char c = text[index];
            if(c == '\r' | (c == '\n' & (index == 0 || text[index - 1] != '\r'))) lineNumber++;
            index++;
        }

        return lineNumber;
    }

    /// <summary>
    /// Returns the number of the line at the specified character index.
    /// </summary>
    /// <param name="index">The index of the character which's line number to get.</param>
    /// <returns></returns>
    public int GetLineNumberUnsafe(int index)
    {
        if(index < 0 || index > text.Length) { throw new ArgumentOutOfRangeException("index"); }

        int lineNumber = 1;
        int end = index;

        index = 0;
        while(index < end) {
            if(IsEndOfLineUnsafe(index)) lineNumber++;
            index++;
        }

        return lineNumber;
    }

    /// <summary>
    /// Returns the number of the line at the specified character index.
    /// </summary>
    /// <param name="index">The index of the character which's line number to get.</param>
    /// <returns></returns>
    public int GetLineNumberUnsafeInline(int index)
    {
        if(index < 0 || index > text.Length) { throw new ArgumentOutOfRangeException("index"); }

        int lineNumber = 1;
        int end = index;

        unsafe
        {
            fixed(char* ptr = text) {
                index = 0;
                while(index < end) {
                    char c = ptr[index];
                    if(c == '\r' || (c == '\n' && (index == 0 || ptr[index - 1] != '\r'))) lineNumber++;
                    index++;
                }
            }
        }

        return lineNumber;
    }

    /// <summary>
    /// Returns the number of the line at the specified character index. Utilizes a Regex.
    /// </summary>
    /// <param name="index">The index of the character which's line number to get.</param>
    /// <returns></returns>
    public int GetLineNumberRegex(int index)
    {
        return LineRegex.Matches(text.Substring(0, index)).Count + 1;
    }

    /// <summary>
    /// Returns the number of the line at the specified character index. Utilizes a Regex.
    /// </summary>
    /// <param name="index">The index of the character which's line number to get.</param>
    /// <returns></returns>
    public int GetLineNumberRegex2(int index)
    {
        int lineNumber = 1;
        MatchCollection mc = LineRegex.Matches(text);

        for(int y = 0; y < mc.Count; y++) {
            if(mc[y].Index >= index) break;
            lineNumber++;
        }

        return lineNumber;
    }
}
like image 161
Visual Vincent Avatar answered Nov 05 '22 13:11

Visual Vincent