I'm just revising chapter 4 of C# in Depth which deals with nullable types, and I'm adding a section about using the "as" operator, which allows you to write:
    object o = ...;
    int? x = o as int?;
    if (x.HasValue)
    {
        ... // Use x.Value in here
    }
I thought this was really neat, and that it could improve performance over the C# 1 equivalent, using "is" followed by a cast - after all, this way we only need to ask for dynamic type checking once, and then a simple value check.
This appears not to be the case, however. I've included a sample test app below, which basically sums all the integers within an object array - but the array contains a lot of null references and string references as well as boxed integers. The benchmark measures the code you'd have to use in C# 1, the code using the "as" operator, and just for kicks a LINQ solution. To my astonishment, the C# 1 code is 20 times faster in this case - and even the LINQ code (which I'd have expected to be slower, given the iterators involved) beats the "as" code.
Is the .NET implementation of isinst for nullable types just really slow? Is it the additional unbox.any that causes the problem? Is there another explanation for this? At the moment it feels like I'm going to have to include a warning against using this in performance-sensitive situations...
Results:
Cast: 10000000 : 121
As: 10000000 : 2211
LINQ: 10000000 : 2143
Code:
    using System;
    using System.Diagnostics;
    using System.Linq;

    class Test
    {
        const int Size = 30000000;

        static void Main()
        {
            object[] values = new object[Size];
            for (int i = 0; i < Size - 2; i += 3)
            {
                values[i] = null;
                values[i+1] = "";
                values[i+2] = 1;
            }
            FindSumWithCast(values);
            FindSumWithAs(values);
            FindSumWithLinq(values);
        }

        static void FindSumWithCast(object[] values)
        {
            Stopwatch sw = Stopwatch.StartNew();
            int sum = 0;
            foreach (object o in values)
            {
                if (o is int)
                {
                    int x = (int) o;
                    sum += x;
                }
            }
            sw.Stop();
            Console.WriteLine("Cast: {0} : {1}", sum,
                              (long) sw.ElapsedMilliseconds);
        }

        static void FindSumWithAs(object[] values)
        {
            Stopwatch sw = Stopwatch.StartNew();
            int sum = 0;
            foreach (object o in values)
            {
                int? x = o as int?;
                if (x.HasValue)
                {
                    sum += x.Value;
                }
            }
            sw.Stop();
            Console.WriteLine("As: {0} : {1}", sum,
                              (long) sw.ElapsedMilliseconds);
        }

        static void FindSumWithLinq(object[] values)
        {
            Stopwatch sw = Stopwatch.StartNew();
            int sum = values.OfType<int>().Sum();
            sw.Stop();
            Console.WriteLine("LINQ: {0} : {1}", sum,
                              (long) sw.ElapsedMilliseconds);
        }
    }
You typically use a nullable value type when you need to represent the undefined value of an underlying value type. For example, a Boolean, or bool, variable can only be either true or false. However, in some applications a variable value can be undefined or missing.
When the nullable type is boxed, the underlying value type is stored in the object, rather than an instance of the nullable type itself. For example, if we box int?, the boxed value will store an int.
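The boxing behaviour described above can be observed directly. The following is a minimal sketch; the class name BoxingDemo and the helper BoxedTypeName are made up for illustration and are not part of the original benchmark.

```csharp
using System;

static class BoxingDemo
{
    // Hypothetical helper: boxes a nullable int and reports the runtime
    // type of the resulting object, or "null" if boxing gave a null reference.
    public static string BoxedTypeName(int? value)
    {
        object boxed = value;   // boxing conversion from int? to object
        return boxed == null ? "null" : boxed.GetType().FullName;
    }

    static void Main()
    {
        Console.WriteLine(BoxedTypeName(5));     // System.Int32
        Console.WriteLine(BoxedTypeName(null));  // null
    }
}
```

A nullable with a value boxes to a plain boxed int, and a nullable without a value boxes to a null reference, so there is never a "boxed Nullable<int>" on the heap.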
Nullable variables may either contain a valid value or they may not; in the latter case they are considered to be nil. Non-nullable variables must always contain a value and cannot be nil. In Oxygene (as in C# and Java), the default nullability of a variable is determined by its type.
C# provides special data types, the nullable types, to which you can assign the normal range of values as well as null. For example, you can store any value from -2,147,483,648 to 2,147,483,647, or null, in a Nullable<Int32> variable. Similarly, you can assign true, false, or null to a Nullable<bool> variable.
Clearly the machine code the JIT compiler can generate for the first case is much more efficient. One rule that really helps there is that an object can only be unboxed to a variable that has the same type as the boxed value. That allows the JIT compiler to generate very efficient code, since no value conversions have to be considered.
The is operator test is easy: just check that the object isn't null and is of the expected type, which takes but a few machine code instructions. The cast is also easy: the JIT compiler knows the location of the value bits in the object and uses them directly. No copying or conversion occurs, all machine code is inline, and it takes but about a dozen instructions. This needed to be really efficient back in .NET 1.0 when boxing was common.
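The "same type" unboxing rule mentioned above is easy to see from C#. This is a small sketch with made-up names (UnboxDemo, TryUnboxLong): a boxed int can be unboxed to int, but unboxing it straight to long fails at runtime even though int converts implicitly to long.

```csharp
using System;

static class UnboxDemo
{
    // Hypothetical helper: attempts to unbox a non-null object directly to long.
    // Returns false if the boxed value is not exactly a long.
    public static bool TryUnboxLong(object o, out long result)
    {
        try
        {
            result = (long)o;   // unboxing: only succeeds for a boxed long
            return true;
        }
        catch (InvalidCastException)
        {
            result = 0;
            return false;
        }
    }

    static void Main()
    {
        long value;
        object boxedInt = 1;
        object boxedLong = 1L;
        Console.WriteLine(TryUnboxLong(boxedInt, out value));   // False
        Console.WriteLine(TryUnboxLong(boxedLong, out value));  // True
        Console.WriteLine((long)(int)boxedInt);                 // 1 (unbox to int, then convert)
    }
}
```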
Casting to int? takes a lot more work. The value representation of the boxed integer is not compatible with the memory layout of Nullable<int>. A conversion is required and the code is tricky due to possible boxed enum types. The JIT compiler generates a call to a CLR helper function named JIT_Unbox_Nullable to get the job done. This is a general purpose function for any value type, lots of code there to check types. And the value is copied. Hard to estimate the cost since this code is locked up inside mscorwks.dll, but hundreds of machine code instructions is likely.
The Linq OfType() extension method also uses the is operator and the cast. This is however a cast to a generic type. The JIT compiler generates a call to a helper function, JIT_Unbox(), that can perform a cast to an arbitrary value type. I don't have a great explanation why it is as slow as the cast to Nullable<int>, given that less work ought to be necessary. I suspect that ngen.exe might cause trouble here.
It seems to me that isinst is just really slow on nullable types. In method FindSumWithCast I changed
    if (o is int)
to
    if (o is int?)
which also significantly slows down execution. The only difference in IL I can see is that
    isinst [mscorlib]System.Int32
gets changed to
    isinst valuetype [mscorlib]System.Nullable`1<int32>
This originally started out as a Comment to Hans Passant's excellent answer, but it got too long so I want to add a few bits here:
First, the C# as operator will emit an isinst IL instruction (so does the is operator). (Another interesting instruction is castclass, emitted when you do a direct cast and the compiler knows that runtime checking cannot be omitted.)
Here is what isinst does (ECMA 335 Partition III, 4.6):
Format: isinst typeTok
typeTok is a metadata token (a typeref, typedef or typespec), indicating the desired class.
If typeTok is a non-nullable value type or a generic parameter type it is interpreted as “boxed” typeTok.
If typeTok is a nullable type, Nullable<T>, it is interpreted as “boxed” T.
Most importantly:
If the actual type (not the verifier tracked type) of obj is verifier-assignable-to the type typeTok then isinst succeeds and obj (as result) is returned unchanged while verification tracks its type as typeTok. Unlike coercions (§1.6) and conversions (§3.27), isinst never changes the actual type of an object and preserves object identity (see Partition I).
So, the performance killer isn't isinst in this case, but the additional unbox.any. This wasn't clear from Hans' answer, as he looked at the JITed code only. In general, the C# compiler will emit an unbox.any after an isinst T? (but will omit it in case you do isinst T, when T is a reference type).
Why does it do that? isinst T? never has the effect that would have been obvious, i.e. you get back a T?. Instead, all this instruction ensures is that you have a "boxed T" that can be unboxed to T?. To get an actual T?, we still need to unbox our "boxed T" to T?, which is why the compiler emits an unbox.any after isinst. If you think about it, this makes sense because the "box format" for T? is just a "boxed T", and making castclass and isinst perform the unbox would be inconsistent.
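Since the extra cost is attributed to unboxing a "boxed T" to T?, one way to get the same result as the as operator without that step is to test with is, unbox to the underlying type, and only then convert to the nullable type. The following is only a sketch under that assumption; the names NullableCastDemo, AsNullableInt and SumInts are made up for illustration, and whether it is actually faster would need to be measured on the runtime in question.

```csharp
using System;

static class NullableCastDemo
{
    // Hypothetical helper: yields the same value as "o as int?", but spelled
    // as a type test, an unbox to int, and a conversion from int to int?.
    public static int? AsNullableInt(object o)
    {
        return (o is int) ? (int?)(int)o : null;
    }

    // Sums the boxed ints in a mixed array, skipping nulls and other types.
    public static int SumInts(object[] values)
    {
        int sum = 0;
        foreach (object o in values)
        {
            int? x = AsNullableInt(o);
            if (x.HasValue)
            {
                sum += x.Value;
            }
        }
        return sum;
    }

    static void Main()
    {
        object[] values = { null, "", 1, 2 };
        Console.WriteLine(SumInts(values)); // 3
    }
}
```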
Backing up Hans' finding with some information from the standard, here it goes:
(ECMA 335 Partition III, 4.33): unbox.any
When applied to the boxed form of a value type, the unbox.any instruction extracts the value contained within obj (of type O). (It is equivalent to unbox followed by ldobj.) When applied to a reference type, the unbox.any instruction has the same effect as castclass typeTok.
(ECMA 335 Partition III, 4.32): unbox
Typically, unbox simply computes the address of the value type that is already present inside of the boxed object. This approach is not possible when unboxing nullable value types. Because Nullable<T> values are converted to boxed Ts during the box operation, an implementation often must manufacture a new Nullable<T> on the heap and compute the address to the newly allocated object.