
Why does Visual Studio add "-1937169414" to a generated hash code computation?

If you use Visual Studio's own refactoring menu to add a GetHashCode implementation to a class like this:

[Screenshot: Generate GetHashCode menu]

and select the only int property in the class:

[Screenshot: Member selection screen]

it generates this code on .NET Framework:

public override int GetHashCode()
{
    return -1937169414 + Value.GetHashCode();
}

(On .NET Core it generates HashCode.Combine(Value) instead; I'm not sure whether that involves the same value.)

What's special about this value? Why doesn't Visual Studio use Value.GetHashCode() directly? As I understand it, the constant doesn't really affect hash distribution. Since it's just addition, consecutive values would still end up next to each other.

EDIT: I had only tried this with different classes that all had a Value property, but apparently the property name affects the generated number. For instance, if you rename the property to Halue, the number becomes 387336856. Thanks to Gökhan Kurt, who pointed this out.

Sedat Kapanoglu asked Apr 30 '20 07:04



1 Answer

If you search for -1521134295 in Microsoft's repositories, you'll see that it appears quite a number of times:

  • https://github.com/search?q=org%3Amicrosoft+%22-1521134295%22+OR+0xa5555529&type=Code
  • https://github.com/search?q=org%3Adotnet++%22-1521134295%22+OR+0xa5555529&type=Code

Most of the search results are in GetHashCode functions, and they all have the following form:

int hashCode = SOME_CONSTANT;
hashCode = hashCode * -1521134295 + field1.GetHashCode();
hashCode = hashCode * -1521134295 + field2.GetHashCode();
// ...
return hashCode;

The first multiplication, hashCode * -1521134295 = SOME_CONSTANT * -1521134295, is pre-computed, either at generation time by the generator or at compile time by CSC. That's where the -1937169414 in your code comes from.
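The folding step can be checked with a minimal sketch. The class and method names here (FoldedSeedDemo, Unfolded, Folded) are hypothetical and used only for illustration; the actual per-class seed is whatever the generator computed.

```csharp
using System;

public static class FoldedSeedDemo
{
    // Multiplier that appears in the generated GetHashCode bodies.
    private const int HashFactor = -1521134295;

    // Unfolded form: start from a seed, then multiply-and-add for one field.
    public static int Unfolded(int seed, int fieldHash)
    {
        unchecked
        {
            int hashCode = seed;
            hashCode = hashCode * HashFactor + fieldHash;
            return hashCode;
        }
    }

    // Folded form: seed * HashFactor has already been computed ahead of time,
    // so only a single addition remains in the emitted code.
    public static int Folded(int premultipliedSeed, int fieldHash)
    {
        unchecked
        {
            return premultipliedSeed + fieldHash;
        }
    }
}
```

With a single field, both forms give the same result for any seed, which is why only one literal plus one addition shows up in the generated method.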

Digging deeper into the results reveals the code generation part, which can be found in the function CreateGetHashCodeMethodStatements:

const int hashFactor = -1521134295;

var initHash = 0;
var baseHashCode = GetBaseGetHashCodeMethod(containingType);
if (baseHashCode != null)
{
    initHash = initHash * hashFactor + Hash.GetFNVHashCode(baseHashCode.Name);
}

foreach (var symbol in members)
{
    initHash = initHash * hashFactor + Hash.GetFNVHashCode(symbol.Name);
}

As you can see, the hash depends on the symbol names. In that function the constant is also called permuteValue, probably because the multiplication is meant to shuffle the bits around somehow:

// -1521134295
var permuteValue = CreateLiteralExpression(factory, hashFactor);
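The seed loop quoted above can be sketched as a standalone program. This assumes that Hash.GetFNVHashCode is a standard 32-bit FNV-1a over the UTF-16 characters of the name; SeedSketch, Fnv1a and ComputeInitHash are hypothetical names, and I have not verified that this reproduces the exact constants Visual Studio emits.

```csharp
using System;
using System.Collections.Generic;

public static class SeedSketch
{
    private const int HashFactor = -1521134295;

    // Standard 32-bit FNV-1a parameters (assumed to match Roslyn's helper).
    private const int FnvOffsetBias = unchecked((int)2166136261);
    private const int FnvPrime = 16777619;

    // FNV-1a over the UTF-16 chars of the text: XOR the char in, then multiply.
    public static int Fnv1a(string text)
    {
        unchecked
        {
            int hash = FnvOffsetBias;
            foreach (char ch in text)
            {
                hash = (hash ^ ch) * FnvPrime;
            }
            return hash;
        }
    }

    // Mirrors the quoted loop: fold each member name into the running seed.
    public static int ComputeInitHash(IEnumerable<string> memberNames)
    {
        unchecked
        {
            int initHash = 0;
            foreach (string name in memberNames)
            {
                initHash = initHash * HashFactor + Fnv1a(name);
            }
            return initHash;
        }
    }
}
```

Under these assumptions, ComputeInitHash(new[] { "Value" }) and ComputeInitHash(new[] { "Halue" }) produce different seeds, which is consistent with the constant changing when the property is renamed.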

There are some patterns if we view -1521134295 in binary: 101001 010101010101010 101001 01001 or 10100 1010101010101010 10100 10100 1. But if we multiply an arbitrary value by it, there are lots of overlapping carries, so I couldn't see how it works. The output may also have a different number of set bits, so it's not really a permutation.
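The bit pattern and the set-bit observation can be checked with a small sketch (FactorBits and its members are hypothetical names). Because the factor is odd, multiplying by it is still a one-to-one mapping on 32-bit values, even though it does not merely rearrange bits.

```csharp
using System;

public static class FactorBits
{
    public const int HashFactor = -1521134295; // same bits as 0xa5555529

    // 32-character two's complement binary representation of the value.
    public static string ToBinary(int value)
    {
        return Convert.ToString(value, 2).PadLeft(32, '0');
    }

    // Counts the set bits by shifting through the unsigned representation.
    public static int CountSetBits(int value)
    {
        uint bits = unchecked((uint)value);
        int count = 0;
        while (bits != 0)
        {
            count += (int)(bits & 1);
            bits >>= 1;
        }
        return count;
    }
}
```

For example, the input 1 has one set bit, while 1 * HashFactor has 15, so the multiplication changes the number of set bits rather than just moving them.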

You can find another generator in Roslyn's AnonymousTypeGetHashCodeMethodSymbol, which calls the constant HASH_FACTOR:

//  Method body:
//
//  HASH_FACTOR = 0xa5555529;
//  INIT_HASH = (...((0 * HASH_FACTOR) + GetFNVHashCode(backingFld_1.Name)) * HASH_FACTOR
//                                     + GetFNVHashCode(backingFld_2.Name)) * HASH_FACTOR
//                                     + ...
//                                     + GetFNVHashCode(backingFld_N.Name)

The real reason for choosing that value is still unclear.

phuclv answered Oct 02 '22 15:10