Given that collections like System.Collections.Generic.HashSet<> accept null as a set member, one can ask what the hash code of null should be. It looks like the framework uses 0:
    // nullable struct type
    int? i = null;
    i.GetHashCode(); // gives 0
    EqualityComparer<int?>.Default.GetHashCode(i); // gives 0

    // class type
    CultureInfo c = null;
    EqualityComparer<CultureInfo>.Default.GetHashCode(c); // gives 0
This can be (a little) problematic with nullable enums. If we define
    enum Season { Spring, Summer, Autumn, Winter, }
then the Nullable<Season> (also called Season?) can take just five values, but two of them, namely null and Season.Spring, have the same hash code.
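The collision is easy to observe directly. The following is a minimal sketch; the class name NullHashDemo and its methods are illustrative helpers, not part of the framework. It asks the default comparer for the hash code of each of the five Season? values.

```csharp
using System;
using System.Collections.Generic;

enum Season { Spring, Summer, Autumn, Winter }

// Illustrative helper, not a framework type.
static class NullHashDemo
{
    // Hash code that the default comparer assigns to a Season? value.
    public static int HashOf(Season? s) =>
        EqualityComparer<Season?>.Default.GetHashCode(s);

    // Prints the hash code of all five possible Season? values.
    public static void Print()
    {
        Season?[] values = { null, Season.Spring, Season.Summer, Season.Autumn, Season.Winter };
        foreach (Season? s in values)
        {
            string name = s.HasValue ? s.Value.ToString() : "null";
            Console.WriteLine($"{name}: {HashOf(s)}");
        }
    }
}
```

Calling NullHashDemo.Print() should list the same hash code for null and Spring, matching the observation above.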
It is tempting to write a "better" equality comparer like this:
    class NewNullEnumEqComp<T> : EqualityComparer<T?> where T : struct
    {
        public override bool Equals(T? x, T? y)
        {
            return Default.Equals(x, y);
        }

        public override int GetHashCode(T? x)
        {
            return x.HasValue ? Default.GetHashCode(x) : -1;
        }
    }
But is there any reason why the hash code of null should be 0?
EDIT/ADDITION:
Some people seem to think this is about overriding Object.GetHashCode(). It is not. (The authors of .NET did make an override of GetHashCode() in the Nullable<> struct, which is relevant, though.) A user-written implementation of the parameterless GetHashCode() can never handle the situation where the object whose hash code we seek is null.
This is about implementing the abstract method EqualityComparer<T>.GetHashCode(T) or otherwise implementing the interface method IEqualityComparer<T>.GetHashCode(T). Now, while creating these links to MSDN, I see that it says there that these methods throw an ArgumentNullException if their sole argument is null. Surely this is a mistake on MSDN? None of .NET's own implementations throw exceptions. Throwing in that case would effectively break any attempt to add null to a HashSet<>, unless HashSet<> does something extraordinary when dealing with a null item (I will have to test that).
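One way to run such a test without a debugger is a probe comparer that records whether HashSet<> ever passes null to GetHashCode. This is only a sketch: NullProbeComparer and ProbeDemo are hypothetical names introduced here, and what the counter shows depends on the HashSet<> implementation of the runtime in use.

```csharp
using System;
using System.Collections.Generic;

enum Season { Spring, Summer, Autumn, Winter }

// Hypothetical probe: counts how often GetHashCode receives a null argument.
class NullProbeComparer : EqualityComparer<Season?>
{
    public int NullCalls { get; private set; }

    public override bool Equals(Season? x, Season? y) => Default.Equals(x, y);

    public override int GetHashCode(Season? x)
    {
        if (!x.HasValue) NullCalls++;      // record the call instead of throwing
        return Default.GetHashCode(x);
    }
}

static class ProbeDemo
{
    // Adds null and Season.Spring to a set, then reports the set size
    // and the number of times the comparer was asked to hash null.
    public static (int Count, int NullCalls) Run()
    {
        var probe = new NullProbeComparer();
        var set = new HashSet<Season?>(probe) { null, Season.Spring };
        return (set.Count, probe.NullCalls);
    }
}
```

If NullCalls stays at 0 after ProbeDemo.Run(), the set handled null itself and never consulted the comparer for it.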
NEW EDIT/ADDITION:
Now I tried debugging. With HashSet<>, I can confirm that with the default equality comparer, the values Season.Spring and null end up in the same bucket. This can be determined by very carefully inspecting the private array members m_buckets and m_slots. Note that the indices are always, by design, offset by one.
The code I gave above does not, however, fix this. As it turns out, HashSet<> will never even ask the equality comparer when the value is null. This is from the source code of HashSet<>:
    // Workaround Comparers that throw ArgumentNullException for GetHashCode(null).
    private int InternalGetHashCode(T item)
    {
        if (item == null)
        {
            return 0;
        }
        return m_comparer.GetHashCode(item) & Lower31BitMask;
    }
This means that, at least for HashSet<>, it is not even possible to change the hash of null. Instead, a solution is to change the hash of all the other values, like this:
    class NewerNullEnumEqComp<T> : EqualityComparer<T?> where T : struct
    {
        public override bool Equals(T? x, T? y)
        {
            return Default.Equals(x, y);
        }

        public override int GetHashCode(T? x)
        {
            return x.HasValue ? 1 + Default.GetHashCode(x) : /* not seen by HashSet: */ 0;
        }
    }
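For completeness, here is a small usage sketch of that comparer with a HashSet<Season?>. The comparer is repeated so the snippet compiles on its own; the unchecked wrapper and the ShiftedHashDemo helper are additions made for this illustration and are not part of the original code.

```csharp
using System;
using System.Collections.Generic;

enum Season { Spring, Summer, Autumn, Winter }

// The comparer from the question: every non-null hash is shifted by one,
// so no Season value hashes to 0, the value HashSet<> uses for null.
class NewerNullEnumEqComp<T> : EqualityComparer<T?> where T : struct
{
    public override bool Equals(T? x, T? y) => Default.Equals(x, y);

    // unchecked (an addition here) lets a hash of int.MaxValue wrap around
    // rather than throw when the code is compiled with overflow checking.
    public override int GetHashCode(T? x) =>
        x.HasValue ? unchecked(1 + Default.GetHashCode(x)) : 0;
}

// Illustrative helper, not a framework type.
static class ShiftedHashDemo
{
    // Builds a set that uses the shifted comparer.
    public static HashSet<Season?> Build() =>
        new HashSet<Season?>(new NewerNullEnumEqComp<Season>())
        {
            null, Season.Spring, Season.Winter
        };
}
```

With this comparer, Season.Spring hashes to 1 instead of 0, so it no longer shares a hash code with null.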
The hash code of the null key is 0.
In C#, you can assign the null value to any reference variable. The null value simply means that the variable does not refer to an object in memory. You can use it like this:
    Circle c = new Circle(42);
    Circle copy = null; // Initialized
    ...
    if (copy == null)
    {
        copy = c; // copy and c refer to the same object
        ...
    }
NO! A hash code is not an id, and it doesn't return a unique value. This is kind of obvious, when you think about it: GetHashCode returns an Int32 , which has “only” about 4.2 billion possible values, and there's potentially an infinity of different objects, so some of them are bound to have the same hash code.
So long as the hash code returned for nulls is consistent for the type, you should be fine. The only requirement for a hash code is that two objects that are considered equal share the same hash code.
Returning 0 or -1 for null will work, so long as you choose one and return it every time. Ideally, no non-null value should hash to whatever value you use for null, although such a collision only affects performance, not correctness.
Similar questions:
GetHashCode on null fields?
What should GetHashCode return when object's identifier is null?
The "Remarks" section of this MSDN entry goes into more detail about hash codes. Notably, the documentation does not cover or discuss null values at all, not even in the community content.
To address your issue with the enum, either re-implement the hash code to return non-zero, add a default "unknown" enum entry equivalent to null, or simply don't use nullable enums.
Interesting find, by the way.
Another problem I see with this generally is that the hash code cannot represent a nullable type of 4 bytes or more without at least one collision (more as the type size increases). For example, the hash code of an int is just the int, so it uses the full int range. What value in that range do you choose for null? Whichever one you pick will collide with the hash code of that same int value.
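A short sketch of that argument for int?: whatever hash is proposed for null, the int with that same value already produces it, because int.GetHashCode() returns the value itself. IntCollisionDemo is a hypothetical helper name used only for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Illustrative helper, not a framework type.
static class IntCollisionDemo
{
    // Returns true when the proposed hash for null is already produced
    // by some non-null int? value under the default comparer.
    public static bool CollidesWithSomeInt(int candidateNullHash)
    {
        // The witness is the int whose value equals the candidate hash.
        int? witness = candidateNullHash;
        return EqualityComparer<int?>.Default.GetHashCode(witness) == candidateNullHash;
    }
}
```

Any candidate, whether 0, -1, or int.MaxValue, finds such a witness, so no choice of null hash avoids a collision for int?.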
Collisions in and of themselves are not necessarily a problem, but you need to know they are there. Hash codes are only used in some circumstances. As stated in the docs on MSDN, hash codes are not guaranteed to return different values for different objects so shouldn't be expected to.