Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

C# switch statement limitations - why?

People also ask

Bahasa C digunakan untuk apa?

Meskipun C dibuat untuk memprogram sistem dan jaringan komputer namun bahasa ini juga sering digunakan dalam mengembangkan software aplikasi. C juga banyak dipakai oleh berbagai jenis platform sistem operasi dan arsitektur komputer, bahkan terdapat beberepa compiler yang sangat populer telah tersedia.

Apa yang dimaksud dengan huruf C?

C adalah huruf ketiga dalam alfabet Latin. Dalam bahasa Indonesia, huruf ini disebut ce (dibaca [tʃe]).


It's important not to confuse the C# switch statement with the CIL switch instruction.

The CIL switch is a jump table, that requires an index into a set of jump addresses.

This is only useful if the C# switch's cases are adjacent:

case 3: blah; break;
case 4: blah; break;
case 5: blah; break;

But of little use if they aren't:

case 10: blah; break;
case 200: blah; break;
case 3000: blah; break;

(You'd need a table ~3000 entries in size, with only 3 slots used)

With non-adjacent expressions, the compiler may start to perform linear if-else-if-else checks.

With larger non- adjacent expression sets, the compiler may start with a binary tree search, and finally if-else-if-else the last few items.

With expression sets containing clumps of adjacent items, the compiler may binary tree search, and finally a CIL switch.

This is full of "mays" & "mights", and it is dependent on the compiler (may differ with Mono or Rotor).

I replicated your results on my machine using adjacent cases:

total time to execute a 10 way switch, 10000 iterations (ms) : 25.1383
approximate time per 10 way switch (ms) : 0.00251383

total time to execute a 50 way switch, 10000 iterations (ms) : 26.593
approximate time per 50 way switch (ms) : 0.0026593

total time to execute a 5000 way switch, 10000 iterations (ms) : 23.7094
approximate time per 5000 way switch (ms) : 0.00237094

total time to execute a 50000 way switch, 10000 iterations (ms) : 20.0933
approximate time per 50000 way switch (ms) : 0.00200933

Then I also did using non-adjacent case expressions:

total time to execute a 10 way switch, 10000 iterations (ms) : 19.6189
approximate time per 10 way switch (ms) : 0.00196189

total time to execute a 500 way switch, 10000 iterations (ms) : 19.1664
approximate time per 500 way switch (ms) : 0.00191664

total time to execute a 5000 way switch, 10000 iterations (ms) : 19.5871
approximate time per 5000 way switch (ms) : 0.00195871

A non-adjacent 50,000 case switch statement would not compile.
"An expression is too long or complex to compile near 'ConsoleApplication1.Program.Main(string[])'

What's funny here, is that the binary tree search appears a little (probably not statistically) quicker than the CIL switch instruction.

Brian, you've used the word "constant", which has a very definite meaning from a computational complexity theory perspective. While the simplistic adjacent integer example may produce CIL that is considered O(1) (constant), a sparse example is O(log n) (logarithmic), clustered examples lie somewhere in between, and small examples are O(n) (linear).

This doesn't even address the String situation, in which a static Generic.Dictionary<string,int32> may be created, and will suffer definite overhead on first use. Performance here will be dependent on the performance of Generic.Dictionary.

If you check the C# Language Specification (not the CIL spec) you'll find "15.7.2 The switch statement" makes no mention of "constant time" or that the underlying implementation even uses the CIL switch instruction (be very careful of assuming such things).

At the end of the day, a C# switch against an integer expression on a modern system is a sub-microsecond operation, and not normally worth worrying about.


Of course these times will depend on machines and conditions. I wouldn’t pay attention to these timing tests, the microsecond durations we’re talking about are dwarfed by any “real” code being run (and you must include some “real code” otherwise the compiler will optimise the branch away), or jitter in the system. My answers are based on using IL DASM to examine the CIL created by the C# compiler. Of course, this isn’t final, as the actual instructions the CPU runs are then created by the JIT.

I have checked the final CPU instructions actually executed on my x86 machine, and can confirm a simple adjacent set switch doing something like:

  jmp     ds:300025F0[eax*4]

Where a binary tree search is full of:

  cmp     ebx, 79Eh
  jg      3000352B
  cmp     ebx, 654h
  jg      300032BB
  …
  cmp     ebx, 0F82h
  jz      30005EEE

This is my original post, which sparked some debate... because it is wrong:

The switch statement is not the same thing as a big if-else statement. Each case must be unique and evaluated statically. The switch statement does a constant time branch regardless of how many cases you have. The if-else statement evaluates each condition until it finds one that is true.


In fact, the C# switch statement is not always a constant time branch.

In some cases the compiler will use a CIL switch statement which is indeed a constant time branch using a jump table. However, in sparse cases as pointed out by Ivan Hamilton the compiler may generate something else entirely.

This is actually quite easy to verify by writing various C# switch statements, some sparse, some dense, and looking at the resulting CIL with the ildasm.exe tool.


The first reason that comes to mind is historical:

Since most C, C++, and Java programmers are not accustomed to having such freedoms, they do not demand them.

Another, more valid, reason is that the language complexity would increase:

First of all, should the objects be compared with .Equals() or with the == operator? Both are valid in some cases. Should we introduce new syntax to do this? Should we allow the programmer to introduce their own comparison method?

In addition, allowing to switch on objects would break underlying assumptions about the switch statement. There are two rules governing the switch statement that the compiler would not be able to enforce if objects were allowed to be switched on (see the C# version 3.0 language specification, §8.7.2):

  • That the values of switch labels are constant
  • That the values of switch labels are distinct (so that only one switch block can be selected for a given switch-expression)

Consider this code example in the hypothetical case that non-constant case values were allowed:

void DoIt()
{
    String foo = "bar";
    Switch(foo, foo);
}

void Switch(String val1, String val2)
{
    switch ("bar")
    {
        // The compiler will not know that val1 and val2 are not distinct
        case val1:
            // Is this case block selected?
            break;
        case val2:
            // Or this one?
            break;
        case "bar":
            // Or perhaps this one?
            break;
    }
}

What will the code do? What if the case statements are reordered? Indeed, one of the reasons why C# made switch fall-through illegal is that the switch statements could be arbitrarily rearranged.

These rules are in place for a reason - so that the programmer can, by looking at one case block, know for certain the precise condition under which the block is entered. When the aforementioned switch statement grows into 100 lines or more (and it will), such knowledge is invaluable.


By the way, VB, having the same underlying architecture, allows much more flexible Select Case statements (the above code would work in VB) and still produces efficient code where this is possible so the argument by techical constraint has to be considered carefully.


Mostly, those restrictions are in place because of language designers. The underlying justification may be compatibility with languange history, ideals, or simplification of compiler design.

The compiler may (and does) choose to:

  • create a big if-else statement
  • use a MSIL switch instruction (jump table)
  • build a Generic.Dictionary<string,int32>, populate it on first use, and call Generic.Dictionary<>::TryGetValue() for a index to pass to a MSIL switch instruction (jump table)
  • use a combination of if-elses & MSIL "switch" jumps

The switch statement IS NOT a constant time branch. The compiler may find short-cuts (using hash buckets, etc), but more complicated cases will generate more complicated MSIL code with some cases branching out earlier than others.

To handle the String case, the compiler will end up (at some point) using a.Equals(b) (and possibly a.GetHashCode() ). I think it would be trival for the compiler to use any object that satisfies these constraints.

As for the need for static case expressions... some of those optimisations (hashing, caching, etc) would not be available if the case expressions weren't deterministic. But we've already seen that sometimes the compiler just picks the simplistic if-else-if-else road anyway...

Edit: lomaxx - Your understanding of the "typeof" operator is not correct. The "typeof" operator is used to obtain the System.Type object for a type (nothing to do with its supertypes or interfaces). Checking run-time compatibility of an object with a given type is the "is" operator's job. The use of "typeof" here to express an object is irrelevant.


Microsoft finally heard you!

Now with C# 7 you can:

switch(shape)
{
case Circle c:
    WriteLine($"circle with radius {c.Radius}");
    break;
case Rectangle s when (s.Length == s.Height):
    WriteLine($"{s.Length} x {s.Height} square");
    break;
case Rectangle r:
    WriteLine($"{r.Length} x {r.Height} rectangle");
    break;
default:
    WriteLine("<unknown shape>");
    break;
case null:
    throw new ArgumentNullException(nameof(shape));
}