Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

XOR Operator - How does it work?

Can you please explain me in plain English what is the XOR (^) operator and what it does in the following code:

public int GetHashCode(Box bx)
{
    int hCode = bx.Height ^ bx.Length ^ bx.Width;
    return hCode.GetHashCode();
} 
like image 869
Yair Nevet Avatar asked Aug 14 '13 18:08

Yair Nevet


People also ask

How does the XOR operator work?

The XOR logical operation, exclusive or, takes two boolean operands and returns true if, and only if, the operands are different. Conversely, it returns false if the two operands have the same value.

What happens when you XOR two numbers?

XOR Background As we can see, if both bits are similar, then the XOR equals to zero. Otherwise, the result equals to one. Put differently, the bitwise XOR operation equals zero in case the number of ones is even and equals one in case the number of ones is odd.


2 Answers

XOR stands for exclusive or. It ensures that either A or B is true but never both. In this case we're doing a bitwise operation so you can make a nice little graph of the outcomes, they are as follows;

0 ^ 1 = 1
1 ^ 1 = 0
1 ^ 0 = 1
0 ^ 0 = 0

Since you're applying it to integers the above outcomes are applied to each bit in the operands. So lets just say you have the values 1, 2, 3 for height, length, and width respectively.

You would first have

0001 ^ 0010 resulting in 0011 then that would be XOR'd into 3 so 0011 ^ 0011 which gives you 0000

EDIT: supplying the wiki link from the comments to supplement my explanation; http://en.wikipedia.org/wiki/Exclusive_or#Computer_science

EDIT: Why does 0001 ^ 0010 result in 0011?

So it's best to do this bit by bit. Think of the operator iterating over the two sets of bits and comparing pairs of them. So in this case lets work from right to left (least significant to most in this case).

1 ^ 0 = 1 // xxx1
0 ^ 1 = 1 // xx11
0 ^ 0 = 0 // x011
0 ^ 0 = 0 // 0011  - end of input

so piecing that back together you get 0011. Basically, take each pair of inputs and reference the truth table for the outcome. The comment shows the output with x being a value that has not yet been calculated.

With regard to collisions, yes, in this case there are plenty of collision. If I said it would be unique that was a poor word choice. What I really mean is that if you have 2, 8, 4 as your values XOR'n them in that order will always produce the same value.

like image 148
evanmcdonnal Avatar answered Oct 13 '22 14:10

evanmcdonnal


Elaborating a little bit, you see alot of XORing between fields in getHashCode() methods because its a safe way of getting an objects signature. The concept of a signature is that its like an object's fingerprint, and it needs to fit into 32 bits. This signature is used by a number of objects as a quick comparison, (though, if your planning on using it for that, take a look at that wikipedia article because you need to be careful about equality and hash-codes), or for some kind of addressing (as is the case with .net's Dictionary and Java's HashMap).

The obvious solution to me to get a Box's fingerprint is to simply add up the values, that way if any of them change you'll get a different fingerprint: bx.Height + bx.Length + bx.Width

Given that an equals operation might be really expensive (ie really slow), if we need to test the equality of two boxes:

  • Box {5, 10, 15}
  • Box {30, 40, 50}

Rather than do a full equals comparison, we can compare the two hash-code's, see that they're different, and skip the full equality comparison. In a dictionary this is critical to give us a fast method to find a bin (an element) to put the object in.

But if any of those values are too high, we could get an integer overflow exception, so instead of using addition, we use an XOR. Another solution, and one that's fairly unique to C#, is to use the unchecked{ ... } block, but using XOR is considered more elegant.

There is one more subtle thing we can do to increase performance, and you'll see this with alot of auto-generated hashcode methods (such as those produced by ReSharper or IntelliJ): we can make the order matter by shifting (multiplying) each value.

    public int hashCode() {
        int result = x;
        result = 31 * result ^ y;
        result = 31 * result ^ z;
        return result;
    }

Now whats happening is that each field in your hashcode effectively has a place in the resulting 32 bits. What this means is that the two boxes:

  • Box {1, 20, 30}
  • Box {1, 30, 20}

would not have the same hash codes (they would have the same hash codes with your current system, but they're different!)

There's more than you ever wanted to know about hash-codes but I'll say one more thing.

In both Java/Scala and the .net framework, if you overload either equals or hash-code, you must also overload the other. You must also ensure that if two objects A and B have different hash-codes, then a call to A.Equals(B) must be false.

like image 43
Groostav Avatar answered Oct 13 '22 16:10

Groostav