I came across a very funny situation where comparing a nullable type to null inside a generic method is 234x slower than comparing a value type or a reference type. The code is as follows:
static bool IsNull<T>(T instance)
{
    return instance == null;
}
The execution code is:
int? a = 0;
string b = "A";
int c = 0;
var watch = Stopwatch.StartNew();
for (int i = 0; i < 1000000; i++)
{
    var r1 = IsNull(a);
}
Console.WriteLine(watch.Elapsed.ToString());
watch.Restart();
for (int i = 0; i < 1000000; i++)
{
    var r2 = IsNull(b);
}
Console.WriteLine(watch.Elapsed.ToString());
watch.Restart();
for (int i = 0; i < 1000000; i++)
{
    var r3 = IsNull(c);
}
watch.Stop();
Console.WriteLine(watch.Elapsed.ToString());
Console.ReadKey();
The output for the code above is:
00:00:00.1879827
00:00:00.0008779
00:00:00.0008532
As you can see, comparing a nullable int to null is 234x slower than comparing an int or a string. If I add a second overload with the right constraints, the results change dramatically:
static bool IsNull<T>(T? instance) where T : struct
{
    return instance == null;
}
Now the results are:
00:00:00.0006040
00:00:00.0006017
00:00:00.0006014
Why is that? I didn't check the IL because I'm not fluent in it, but even if the IL were a little different, I would expect the JIT to optimize this, and it doesn't (I'm running with optimizations on).
Here's what you should do to investigate this.
Start by rewriting the program so that it does everything twice. Put a message box between the two iterations. Compile the program with optimizations on, and run the program not in the debugger. This ensures that the jitter generates the most optimal code it can. The jitter knows when a debugger is attached and can generate worse code to make it easier to debug if it thinks that's what you're doing.
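A minimal sketch of that setup, assuming a console program and using Console.ReadLine() as the pause point in place of a message box (the Investigation and RunBenchmarks names are illustrative, not from the original code):
using System;
using System.Diagnostics;

static class Investigation
{
    static bool IsNull<T>(T instance) => instance == null;

    // One pass over the same measurement the question makes
    // (only the int? loop is shown to keep the sketch short).
    static void RunBenchmarks()
    {
        int? a = 0;
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < 1000000; i++)
        {
            var r1 = IsNull(a);
        }
        Console.WriteLine(watch.Elapsed);
    }

    static void Main()
    {
        RunBenchmarks();    // first pass: everything gets jitted and optimized here
        Console.WriteLine("Attach the debugger now, then press Enter.");
        Console.ReadLine(); // pause point standing in for the message box
        RunBenchmarks();    // second pass: step into this one at the assembly level
    }
}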
When the message box pops up, attach the debugger and then trace at the assembly code level into the three different versions of the code, if in fact there even are three different versions. I'd be willing to bet as much as a dollar that no code whatsoever is generated for the first one, because the jitter knows that the whole thing can be optimized away to "return false", and then that return false can be inlined, and perhaps even the loop can be removed.
(In the future, you should probably consider this when writing performance tests. Remember that if you don't use the result then the jitter is free to completely optimize away everything that produces that result, as long as it has no side effect.)
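For example, here is a sketch of one way to keep the results observable (the running count is illustrative, not part of the original benchmark):
using System;

static class KeepResultsAlive
{
    static bool IsNull<T>(T instance) => instance == null;

    static void Main()
    {
        int? a = 0;

        // Fold every result into a count that is printed afterwards;
        // because the count is used, the jitter cannot discard the calls as dead code.
        int nullCount = 0;
        for (int i = 0; i < 1000000; i++)
        {
            if (IsNull(a)) nullCount++;
        }
        Console.WriteLine(nullCount);
    }
}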
Once you can look at the assembly code you'll see what's going on.
I have not investigated this myself personally, but odds are good that what is going on is this:
- in the int codepath, the jitter is realizing that a boxed int is never null and turning the method into "return false"
- in the string codepath, the jitter is realizing that testing a string for nullity is equivalent to testing whether the managed pointer to the string is zero, so it is generating a single instruction that tests whether a register is zero.
- in the int? codepath, probably the jitter is realizing that testing an int? for nullity can be accomplished by boxing the int? -- since a boxed null int is a null reference, that then reduces to the earlier problem of testing a managed pointer against zero. But you take on the cost of the boxing.
If that's the case then the jitter could be more sophisticated here and realize that testing an int? for null can be accomplished by returning the inverse of the HasValue bool inside the int?.
But like I said, that's just a guess. Generate the code yourself and see what it's doing if you're interested.
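As an illustration of the non-boxing check described above (a sketch, assuming the constrained overload from the question), comparing an int? to null is equivalent to negating its HasValue property:
using System;

static class NullableHasValueCheck
{
    // The constrained overload: the compiler knows T? is Nullable<T>,
    // so the null test can read HasValue directly instead of boxing.
    static bool IsNull<T>(T? instance) where T : struct
    {
        return !instance.HasValue;   // same result as instance == null
    }

    static void Main()
    {
        int? none = null;
        int? some = 42;
        Console.WriteLine(IsNull(none)); // True
        Console.WriteLine(IsNull(some)); // False
    }
}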
If you compare the IL produced by the two overloads, you can see that there is boxing involved:
The first looks like:
.method private hidebysig static bool IsNull<T>(!!T instance) cil managed
{
.maxstack 2
.locals init (
[0] bool CS$1$0000)
L_0000: nop
L_0001: ldarg.0
L_0002: box !!T
L_0007: ldnull
L_0008: ceq
L_000a: stloc.0
L_000b: br.s L_000d
L_000d: ldloc.0
L_000e: ret
}
While the second looks like:
.method private hidebysig static bool IsNull<valuetype .ctor ([mscorlib]System.ValueType) T>(valuetype [mscorlib]System.Nullable`1<!!T> instance) cil managed
{
.maxstack 2
.locals init (
[0] bool CS$1$0000)
L_0000: nop
L_0001: ldarga.s instance
L_0003: call instance bool [mscorlib]System.Nullable`1<!!T>::get_HasValue()
L_0008: ldc.i4.0
L_0009: ceq
L_000b: stloc.0
L_000c: br.s L_000e
L_000e: ldloc.0
L_000f: ret
}
In the second case, the compiler knows the type is a Nullable so it can optimize for that. In the first case, it has to handle any type, both reference and value types. So it has to jump through some extra hoops.
As for why int is faster than int?, I'd imagine there are some JIT optimizations involved there.
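To see what the box !!T instruction means for T = int?, here is a small stand-alone illustration (not from the original answer): boxing a Nullable<int> that has no value yields a null reference, while boxing one that has a value allocates a boxed int, and that extra work is what the unconstrained overload pays for.
using System;

static class NullableBoxingDemo
{
    static void Main()
    {
        int? none = null;
        int? some = 42;

        // Boxing a Nullable<int>: an empty value boxes to a null reference,
        // a non-empty value boxes the underlying int (a heap allocation).
        object boxedNone = none;
        object boxedSome = some;

        Console.WriteLine(boxedNone == null); // True
        Console.WriteLine(boxedSome == null); // False
    }
}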