What are the technical reasons behind the difference between the 32-bit and 64-bit versions of string.GetHashCode()?
More importantly, why does the 64-bit version seem to terminate its algorithm when it encounters the NUL character? For example, the following expressions all return true when run under the 64-bit CLR.
"\0123456789".GetHashCode() == "\0987654321".GetHashCode()
"\0AAAAAAAAA".GetHashCode() == "\0BBBBBBBBB".GetHashCode()
"\0The".GetHashCode() == "\0Game".GetHashCode()
This behavior (bug?) manifested as a performance issue when we used such strings as keys in a Dictionary.
No! A hash code is not an ID, and it is not guaranteed to be unique. This is fairly obvious when you think about it: GetHashCode returns an Int32, which has "only" about 4.3 billion possible values, while the number of possible distinct objects is effectively unlimited, so some of them are bound to share a hash code.
The key point is that hash codes are deterministic within a given program execution. They only become a problem if you save a hash code outside the process and load it into another one.
Getting the hash code of a string is simple in C#: call the GetHashCode() method. A hash code is a numeric value derived from the string's contents; it is not a unique identifier. Strings that have the same value have the same hash code, but two different strings can also share one.
This looks like a known issue which Microsoft would not fix:
As you have mentioned this would be a breaking change for some programs (even though they shouldn't really be relying on this), the risk of this was deemed too high to fix this in the current release.
I agree that the rate of collisions that this will cause in the default Dictionary<String, Object> will be inflated by this. If this is adversely affecting your application's performance, I would suggest trying to work around it by using one of the Dictionary constructors that takes an IEqualityComparer so you can provide a more appropriate GetHashCode implementation. I know this isn't ideal and would like to get this fixed in a future version of the .NET Framework.
Source: Microsoft Connect - String.GetHashCode ignores any characters in the string beyond the first null byte in x64 runtime
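The workaround suggested in the quoted reply could look roughly like the sketch below. FullStringComparer is a hypothetical name, not a framework type, and the choice of 32-bit FNV-1a is an assumption made for illustration; any hash that reads every character of the string would serve the same purpose.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer (not part of the .NET Framework). It hashes every
// UTF-16 code unit with 32-bit FNV-1a, so an embedded NUL character does
// not cut the hash short.
public sealed class FullStringComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y)
    {
        // Ordinal comparison; handles null arguments.
        return string.Equals(x, y, StringComparison.Ordinal);
    }

    public int GetHashCode(string s)
    {
        if (s == null) return 0;
        unchecked
        {
            uint hash = 2166136261;              // FNV-1a offset basis
            foreach (char c in s)
            {
                hash ^= (byte)(c & 0xFF);        // low byte of the code unit
                hash *= 16777619;                // FNV prime
                hash ^= (byte)(c >> 8);          // high byte of the code unit
                hash *= 16777619;
            }
            return (int)hash;
        }
    }
}

public static class Demo
{
    public static void Main()
    {
        // Pass the comparer to the Dictionary constructor, as the reply suggests.
        var map = new Dictionary<string, int>(new FullStringComparer());
        map["\0The"] = 1;
        map["\0Game"] = 2;
        Console.WriteLine(map.Count); // 2: the keys are distinct under ordinal equality
    }
}
```

Because the comparer is supplied per dictionary, only the collections that actually hold NUL-prefixed keys need to change; the rest of the program keeps the default string hashing.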
Eric Lippert has a wonderful blog post on this: Curious property in String
Curious property Revealed