I came across a very funny situation where comparing a nullable type to null inside a generic method is 234x slower than comparing a value type or a reference type. The code is as follows:
static bool IsNull<T>(T instance)
{
    return instance == null;
}
The execution code is:
int? a = 0;
string b = "A";
int c = 0;
var watch = Stopwatch.StartNew();
for (int i = 0; i < 1000000; i++)
{
    var r1 = IsNull(a);
}
Console.WriteLine(watch.Elapsed.ToString());
watch.Restart();
for (int i = 0; i < 1000000; i++)
{
    var r2 = IsNull(b);
}
Console.WriteLine(watch.Elapsed.ToString());
watch.Restart();
for (int i = 0; i < 1000000; i++)
{
    var r3 = IsNull(c);
}
watch.Stop();
Console.WriteLine(watch.Elapsed.ToString());
Console.ReadKey();
The output for the code above is:
00:00:00.1879827
00:00:00.0008779
00:00:00.0008532
As you can see, comparing a nullable int to null is 234x slower than comparing an int or a string. If I add a second overload with the right constraints, the results change dramatically:
static bool IsNull<T>(T? instance) where T : struct
{
    return instance == null;
}
Now the results are:
00:00:00.0006040
00:00:00.0006017
00:00:00.0006014
Why is that? I didn't check the IL because I'm not fluent in it, but even if the IL were a little different, I would expect the JIT to optimize this, and it doesn't (I'm running with optimizations on).
Here's what you should do to investigate this.
Start by rewriting the program so that it does everything twice. Put a message box between the two iterations. Compile the program with optimizations on, and run the program not in the debugger. This ensures that the jitter generates the most optimal code it can. The jitter knows when a debugger is attached and can generate worse code to make it easier to debug if it thinks that's what you're doing.
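A minimal sketch of that setup, assuming a console program and using Console.ReadLine() as the pause point in place of a message box (the Investigation and RunBenchmarks names are illustrative, not from the original code):
using System;
using System.Diagnostics;

static class Investigation
{
    static bool IsNull<T>(T instance) => instance == null;

    // One pass over the same measurement the question makes
    // (only the int? loop is shown to keep the sketch short).
    static void RunBenchmarks()
    {
        int? a = 0;
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < 1000000; i++)
        {
            var r1 = IsNull(a);
        }
        Console.WriteLine(watch.Elapsed);
    }

    static void Main()
    {
        RunBenchmarks();    // first pass: everything gets jitted and optimized here
        Console.WriteLine("Attach the debugger now, then press Enter.");
        Console.ReadLine(); // pause point standing in for the message box
        RunBenchmarks();    // second pass: step into this one at the assembly level
    }
}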
When the message box pops up, attach the debugger and then trace at the assembly code level into the three different versions of the code, if in fact there even are three different versions. I'd be willing to bet as much as a dollar that no code whatsoever is generated for the first one, because the jitter knows that the whole thing can be optimized away to "return false", and then that return false can be inlined, and perhaps even the loop can be removed.
(In the future, you should probably consider this when writing performance tests. Remember that if you don't use the result then the jitter is free to completely optimize away everything that produces that result, as long as it has no side effect.)
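For example, here is a sketch of one way to keep the results observable (the running count is illustrative, not part of the original benchmark):
using System;

static class KeepResultsAlive
{
    static bool IsNull<T>(T instance) => instance == null;

    static void Main()
    {
        int? a = 0;

        // Fold every result into a count that is printed afterwards;
        // because the count is used, the jitter cannot discard the calls as dead code.
        int nullCount = 0;
        for (int i = 0; i < 1000000; i++)
        {
            if (IsNull(a)) nullCount++;
        }
        Console.WriteLine(nullCount);
    }
}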
Once you can look at the assembly code you'll see what's going on.
I have not investigated this myself personally, but odds are good that what is going on is this:
- in the int codepath, the jitter is realizing that a boxed int is never null and turning the method into "return false"
- in the string codepath, the jitter is realizing that testing a string for nullity is equivalent to testing whether the managed pointer to the string is zero, so it is generating a single instruction that tests whether a register is zero.
- in the int? codepath, probably the jitter is realizing that testing an int? for nullity can be accomplished by boxing the int? -- since a boxed null int is a null reference, that then reduces to the earlier problem of testing a managed pointer against zero. But you take on the cost of the boxing.
If that's the case then the jitter could be more sophisticated here and realize that testing an int? for null can be accomplished by returning the inverse of the HasValue bool inside the int?.
But like I said, that's just a guess. Generate the code yourself and see what it's doing if you're interested.
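As an illustration of the non-boxing check described above (a sketch, assuming the constrained overload from the question), comparing an int? to null is equivalent to negating its HasValue property:
using System;

static class NullableHasValueCheck
{
    // The constrained overload: the compiler knows T? is Nullable<T>,
    // so the null test can read HasValue directly instead of boxing.
    static bool IsNull<T>(T? instance) where T : struct
    {
        return !instance.HasValue;   // same result as instance == null
    }

    static void Main()
    {
        int? none = null;
        int? some = 42;
        Console.WriteLine(IsNull(none)); // True
        Console.WriteLine(IsNull(some)); // False
    }
}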
If you compare the IL produced by the two overloads, you can see that there is boxing involved:
The first looks like:
.method private hidebysig static bool IsNull<T>(!!T instance) cil managed
{
.maxstack 2
.locals init (
[0] bool CS$1$0000)
L_0000: nop
L_0001: ldarg.0
L_0002: box !!T
L_0007: ldnull
L_0008: ceq
L_000a: stloc.0
L_000b: br.s L_000d
L_000d: ldloc.0
L_000e: ret
}
While the second looks like:
.method private hidebysig static bool IsNull<valuetype .ctor ([mscorlib]System.ValueType) T>(valuetype [mscorlib]System.Nullable`1<!!T> instance) cil managed
{
.maxstack 2
.locals init (
[0] bool CS$1$0000)
L_0000: nop
L_0001: ldarga.s instance
L_0003: call instance bool [mscorlib]System.Nullable`1<!!T>::get_HasValue()
L_0008: ldc.i4.0
L_0009: ceq
L_000b: stloc.0
L_000c: br.s L_000e
L_000e: ldloc.0
L_000f: ret
}
In the second case, the compiler knows the type is a Nullable so it can optimize for that. In the first case, it has to handle any type, both reference and value types. So it has to jump through some extra hoops.
As for why int is faster than int?, I'd imagine there are some JIT optimizations involved there.
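To see what the box !!T instruction means for T = int?, here is a small stand-alone illustration (not from the original answer): boxing a Nullable<int> that has no value yields a null reference, while boxing one that has a value allocates a boxed int, and that extra work is what the unconstrained overload pays for.
using System;

static class NullableBoxingDemo
{
    static void Main()
    {
        int? none = null;
        int? some = 42;

        // Boxing a Nullable<int>: an empty value boxes to a null reference,
        // a non-empty value boxes the underlying int (a heap allocation).
        object boxedNone = none;
        object boxedSome = some;

        Console.WriteLine(boxedNone == null); // True
        Console.WriteLine(boxedSome == null); // False
    }
}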