Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does the compiler add statements to the switch?

I have the following reasonably simple switch statement. // earlier string fullPath = GetFullPath(); string type = GetEntityType();

switch (type.ToLower()) {
    case "tables":
        tables.Add(fullPath);
        break;
    case "views":
        views.Add(fullPath);
        break;
    case "functions":
        functions.Add(fullPath);
        break;
    case "storedprocs":
        storedprocs.Add(fullPath);
        break;
    case "data":
        data.Add(fullPath);
        break;
    case "layouts":
        layouts.Add(fullPath);
        break;
    case "scripts":
        scripts.Add(fullPath);
        break;
    default:
        Console.WriteLine($"What is this: {type}");
        break;
}

When I decompile the resulting binary using Reflector, the switch(string) has been changed to ComputeStringHash and then inside each case statement it checks the value via the if statement. Sounds like it's doing double the work.

    string s = str2.ToLower();
    switch (<PrivateImplementationDetails>.ComputeStringHash(s))
    {
        case 0x20890fc4:
            if (s == "tables")
            {
                break;
            }
            goto Label_0218;

        case 0x454a414e:
            if (s == "functions")
            {
                goto Label_01DE;
            }
            goto Label_0218;

        case 0x4facf6d1:
            if (s == "views")
            {
                goto Label_01D3;
            }
            goto Label_0218;

        case 0xcdfe2cb3:
            if (s == "storedprocs")
            {
                goto Label_01E9;
            }
            goto Label_0218;

        case 0xd872e2a5:
            if (s == "data")
            {
                goto Label_01F4;
            }
            goto Label_0218;

        case 0x9b4a129b:
            if (s == "scripts")
            {
                goto Label_020C;
            }
            goto Label_0218;

        case 0xba971064:
            if (s == "layouts")
            {
                goto Label_0200;
            }
            goto Label_0218;

        default:
            goto Label_0218;
    }

    first.Add(fullPath);
    continue;
  Label_01D3:
      second.Add(fullPath);
      continue;
  Label_01DE:
      list3.Add(fullPath);
      continue;
  Label_01E9:
      list4.Add(fullPath);
      continue;
  Label_01F4:
      list5.Add(fullPath);
      continue;
  Label_0200:
      list6.Add(fullPath);
      continue;
  Label_020C:
      list7.Add(fullPath);
      continue;
  Label_0218:
    Console.WriteLine("What is this: " + str2);
}
like image 924
AngryHacker Avatar asked Jan 27 '23 00:01

AngryHacker


1 Answers

This is a very smart optimization, which lets the switch do its job in time that is almost independent of the number of strings in the case block of the statement.

This optimization is based on the observation that hash codes of identical strings must be the same. Rather than checking strings for equality one-by-one, the compiler computes a hash of the target string once, and performs a table-based lookup in O(1). This gets the compiler to the desired case, at which point the compiler needs to check that the strings are actually equal.

Note that there would be some rare situations when multiple look-up strings would have the same hash code. In such situations the generated case statement would contain multiple if to decide among the strings with equal hash codes.

Overall, this behavior mimics the behavior of hash-based dictionaries: hash code determines the case (an equivalent of a hash bucket) and the series of ifs inside determines if there is a match. This results in a better performance, because it lets the compiler skip the unnecessary checks.

like image 185
Sergey Kalinichenko Avatar answered Feb 12 '23 14:02

Sergey Kalinichenko