Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is the xor operator used in computing hash code? [duplicate]

Tags:

c#

.net

In this MSDN article http://msdn.microsoft.com/en-us/library/ms132123.aspx it discusses the Class Equalitycomparer and has an example.In this example about comparing boxes it has this class -

class BoxSameDimensions : EqualityComparer<Box>
{
    public override bool Equals(Box b1, Box b2)
    {
        if (b1.Height == b2.Height & b1.Length == b2.Length
            & b1.Width == b2.Width)
        {
            return true;
        }
        else
        {
            return false;
        }
    }

    public override int GetHashCode(Box bx)
    {
        int hCode = bx.Height ^ bx.Length ^ bx.Width;
        return hCode.GetHashCode();
    }
}

I don't understand the line int hCode = bx.Height ^ bx.Length ^ bx.Width;

Could someone explain please? Why the xor?

like image 606
Paul Stanley Avatar asked Oct 02 '13 14:10

Paul Stanley


1 Answers

The ^ operator is the bitwise exclusive-or operator.

In this case it's being used as a convenient way to generate a hash code from three integers. (I don't think it's a very good way, but that's a different issue...)

Weirdly, after constructing a hash code, they use GetHashCode() on it again, which is utterly pointless for an int because it will just return the int itself - so it's a no-op.

This is how they should have written it:

public override int GetHashCode(Box bx)
{
    return bx.Height ^ bx.Length ^ bx.Width;
}

This SO answer explains why XOR works quite well sometimes: Why are XOR often used in java hashCode() but another bitwise operators are used rarely?

Note: The reason I don't like using xor for a hash code for three ints like that is because:

a ^ b ^ a == b

In other words if the first and last ints contributing to the hash code are the same, they do not contribute to the final hash code at all - they cancel each other out and the result is always the middle int.

It's even worse if you are only using two ints because:

a ^ a == 0

So for two ints, for all cases where they are the same the hash code will be zero.

like image 61
Matthew Watson Avatar answered Sep 30 '22 14:09

Matthew Watson