Implementing GetHashCode in C#. Null-value handling

Tags:

Before I begin, all code samples here I tested on Mono environment and there is one noticeable difference in the GetHashCode implementations:

string.Empty.GetHashCode(); // returns 0 in Mono 3.10
string.Empty.GetHashCode(); // returns 757602046 in .NET 4.5.1

I made my implementation based on this SO Answer by @JonSkeet and in the comments he also suggests to use 0 hash code for NULL values (wasn't sure how should I hash them).

I usually use 0 as the effective hash code for null - which isn't the same as ignoring the field.

So having following implementation (Mono 3.10):

public class Entity {
    public int EntityID { get; set; }
    public string EntityName { get; set; }

    public override int GetHashCode() {
        unchecked {
            int hash = 15485863;       // prime number
            int multiplier = 1299709;  // another prime number

            hash = hash * multiplier + EntityID.GetHashCode();
            hash = hash * multiplier + (EntityName != null ? EntityName.GetHashCode() : 0);

            return hash;
        }
    }
}

It is quite easy to find collision e.g.

var hash1 = new Entity { EntityID = 1337, EntityName = "" }.GetHashCode();
var hash2 = new Entity { EntityID = 1337, EntityName = null }.GetHashCode();

bool equals = hash1 == hash2; // true

I could replace null-value 0 with some other number, however it won't fix the problem as there still is a chance that some hash(string) output will generate such number and I'll get another collision.

My question: How should I handle null values while using algorithm from example above?

774

asked Jul 03 '15 12:07

Andriy Horen

2 Answers

Your "problem" here is that you are trying the get collision free hash codes. While this is perfect for the lookup performance of collection implementations that use the hash code for lookup (e.g. HashSet and Dictionary) in the most cases this will not work.

The reason for that is that the hash code is just a 32-bit integer value and it represents data that is usually a lot bigger (multiple integer values, strings, etc.).

So the hash code is only there to define that two objects could be equal. The collection classes use the hash code to refine the area where the object is stored and use the equals function to find if two objects are really the same. For that reason you should always implement the Equals function for classes you implemented the hash code for. While those classes will fall back to the equals function of object, is it also a good idea to implement the IEquatable<T> interface to avoid typing problems of any kind (still overwrite the default equals method of Object!)

123

answered Sep 28 '22 16:09

Nitram

My question: How should I handle null values while using algorithm from example above?

I don't think the problem is with null per-se. The problem lays in the fact you're using GetHashCode for equality, which it isn't meant for. GetHashCode should provide such hashes that aspire to normal distribution.

The docs say:

Two objects that are equal return hash codes that are equal. However, the reverse is not true: equal hash codes do not imply object equality, because different (unequal) objects can have identical hash codes.

And then goes on to specify the purpose of GetHashCode:

A hash code is intended for efficient insertion and lookup in collections that are based on a hash table.

You should be implementing IEquatable<Entity>, where you actually define the equivalence relation of two entities. And override != and == while you're at it.

An approximation:

public class Entity : IEquatable<Entity>
{
    public int EntityId { get; set; }
    public string EntityName { get; set; }

    public bool Equals(Entity other)
    {
        if (ReferenceEquals(null, other)) return false;
        if (ReferenceEquals(this, other)) return true;
        return EntityId == other.EntityId && 
               string.Equals(EntityName, other.EntityName);
    }

    public override bool Equals(object obj)
    {
        if (ReferenceEquals(null, obj)) return false;
        if (ReferenceEquals(this, obj)) return true;
        if (obj.GetType() != this.GetType()) return false;
        return Equals((Entity) obj);
    }

    public static bool operator ==(Entity left, Entity right)
    {
        return Equals(left, right);
    }

    public static bool operator !=(Entity left, Entity right)
    {
        return !Equals(left, right);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            return (EntityId*397) ^ (EntityName != null ? EntityName.GetHashCode() : 0);
        }
    }
}

answered Sep 28 '22 16:09

Yuval Itzchakov

Related questions
                            
                                Getting Unhandled Exception: System.Reflection.TargetParameterCountException: Parameter count mismatch
                            
                                Autofac: How do you register a generic with multiple type arguments? Registering IGenRepo<T, TKey> with EFGenRepo<T, TKey>
                            
                                ImmutableHashSet .Contains returns false
                            
                                Pointer of generic type
                            
                                Abort a thread which is running a long query
                            
                                Windows Form: change dataGridView's first cell origin?
                            
                                How to find string value is in exponential format in C#?
                            
                                how to find out if a specific table exists in a dataset
                            
                                How to set SQLite(SQLite.NET component) database Version in Xamarin Android
                            
                                Nexus and uploading artifacts for .NET Applications
                            
                                EPPlus Excel Change cell color
                            
                                DebuggerStepThrough attribute equivalent for properties?
                            
                                C# / WPF Unmask password inside the passwordBox
                            
                                How to implement FIND method of EF in Unit Test?
                            
                                C# DateTime AddDays Unexpected Offset
                            
                                Invalid signature with Bittrex API calls in C#
                            
                                How to convert optional pointer argument from C++ code to C#
                            
                                c# MailKit - Read receipt
                            
                                Using IApplicationBuilder.Map generates nested paths with UseMvc
                            
                                Why is Enumerable Min or Max inconsistent between collections of reference and value types?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Implementing GetHashCode in C#. Null-value handling

Tags:

c#

.net

hash

Andriy Horen

People also ask

2 Answers

Nitram

Yuval Itzchakov

Recent Activity

Donate For Us