
C# Why can equal decimals produce unequal hash values?

We ran into a magic decimal number that broke our hashtable. I boiled it down to the following minimal case:

decimal d0 = 295.50000000000000000000000000m;
decimal d1 = 295.5m;

Console.WriteLine("{0} == {1} : {2}", d0, d1, (d0 == d1));
Console.WriteLine("0x{0:X8} == 0x{1:X8} : {2}", d0.GetHashCode(), d1.GetHashCode(),
                  (d0.GetHashCode() == d1.GetHashCode()));

Giving the following output:

295.50000000000000000000000000 == 295.5 : True
0xBF8D880F == 0x40727800 : False

What is really peculiar: change, add or remove any of the digits in d0 and the problem goes away. Even adding or removing one of the trailing zeros! The sign doesn't seem to matter though.

Our fix is to divide the value to get rid of the trailing zeroes, like so:

decimal d0 = 295.50000000000000000000000000m / 1.000000000000000000000000000000000m;

But my question is, how is C# doing this wrong?

edit: Just noticed this has been fixed in .NET Core 3.0 (possibly earlier, I didn't check) : https://dotnetfiddle.net/4jqYos

asked Dec 16 '11 11:12 by Jannes



3 Answers

Another bug (?) results in different byte representations for the same decimal on different compilers: try compiling the following code with VS 2005 and then with VS 2010. Or look at my article on Code Project.

using System;
using System.IO;

class Program
{
    static void Main(string[] args)
    {
        decimal one = 1m;

        PrintBytes(one);
        PrintBytes(one + 0.0m); // compare this on different compilers!
        PrintBytes(1m + 0.0m);

        Console.ReadKey();
    }

    public static void PrintBytes(decimal d)
    {
        MemoryStream memoryStream = new MemoryStream();
        BinaryWriter binaryWriter = new BinaryWriter(memoryStream);

        binaryWriter.Write(d);

        byte[] decimalBytes = memoryStream.ToArray();

        Console.WriteLine(BitConverter.ToString(decimalBytes) + " (" + d + ")");
    }
}

Some people use the normalization code d = d + 0.0000m, which does not work properly on VS 2010. Your normalization code (d = d / 1.000000000000000000000000000000000m) looks good; I use the same one to get the same byte array for equal decimals.
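
If the decimals are used as dictionary keys, one way to apply that normalization without touching every call site is a custom comparer. The following is only a sketch: the class name NormalizedDecimalComparer and its Normalize helper are made up for illustration and are not part of the framework. It relies on the division trick from the question producing one canonical representation for equal values.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not a framework type): compares decimals by value and
// hashes a normalized copy, so equal values always land in the same bucket.
public sealed class NormalizedDecimalComparer : IEqualityComparer<decimal>
{
    // The division trick from the question: same value, trailing zeros dropped.
    public static decimal Normalize(decimal d)
    {
        return d / 1.000000000000000000000000000000000m;
    }

    public bool Equals(decimal x, decimal y)
    {
        return x == y;
    }

    public int GetHashCode(decimal d)
    {
        // Hash the normalized value instead of the raw representation.
        return Normalize(d).GetHashCode();
    }
}

public static class ComparerDemo
{
    public static void Main()
    {
        decimal d0 = 295.50000000000000000000000000m;
        decimal d1 = 295.5m;

        var comparer = new NormalizedDecimalComparer();
        var lookup = new Dictionary<decimal, string>(comparer);
        lookup[d0] = "stored under d0";

        // Look the entry up again using the shorter spelling of the same value.
        Console.WriteLine(lookup.ContainsKey(d1));
        Console.WriteLine(comparer.GetHashCode(d0) == comparer.GetHashCode(d1));
    }
}
```

The comparer costs one extra division per hash computation, which is usually acceptable for a workaround that can be deleted once the code runs on a runtime where the bug is fixed.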

answered Oct 17 '22 13:10 by CoperNick


To start with, C# isn't doing anything wrong at all. This is a framework bug.

It does indeed look like a bug though - basically whatever normalization is involved in comparing for equality ought to be used in the same way for hash code computation. I've checked and can reproduce it too (using .NET 4) including checking the Equals(decimal) and Equals(object) methods as well as the == operator.

It definitely looks like it's the d0 value which is the problem, as adding trailing 0s to d1 doesn't change the results (until it's the same as d0 of course). I suspect there's some corner case tripped by the exact bit representation there.
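
To see how the two values differ internally, something like the following can be used. It is a rough sketch: DecimalBits, ScaleOf and Dump are names invented here, while decimal.GetBits is the real framework method, returning the 96-bit integer as three ints plus a flags word that carries the scale in bits 16 to 23.

```csharp
using System;

public static class DecimalBits
{
    // Hypothetical helper: number of digits after the decimal point,
    // stored in bits 16-23 of the fourth element returned by GetBits.
    public static int ScaleOf(decimal d)
    {
        return (decimal.GetBits(d)[3] >> 16) & 0xFF;
    }

    // Hypothetical helper: prints the raw 96-bit integer and the scale.
    public static void Dump(decimal d)
    {
        int[] bits = decimal.GetBits(d); // lo, mid, hi, flags
        Console.WriteLine("{0}: lo=0x{1:X8} mid=0x{2:X8} hi=0x{3:X8} scale={4}",
            d, bits[0], bits[1], bits[2], ScaleOf(d));
    }

    public static void Main()
    {
        decimal d0 = 295.50000000000000000000000000m;
        decimal d1 = 295.5m;

        Dump(d0);
        Dump(d1);

        // Equal as values, even though the stored integer and scale differ.
        Console.WriteLine(d0 == d1);
    }
}
```

Any hash code computed from the raw representation rather than from a normalized value has to cope with this difference itself.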

I'm surprised the same normalization isn't used for both (and as you say, it works most of the time), but you should report the bug on Connect.

answered Oct 17 '22 13:10 by Jon Skeet


Ran into this bug too ... :-(

Tests (see below) indicate that this depends on the maximum precision available for the value. The wrong hash codes only occur near the maximum precision for the given value. As the tests show, the error seems to depend on the digits to the left of the decimal point. Sometimes only the hash code for maxDecimalDigits - 1 is wrong, sometimes the one for maxDecimalDigits is wrong.

var data = new decimal[] {
//    123456789012345678901234567890
    1.0m,
    1.00m,
    1.000m,
    1.0000m,
    1.00000m,
    1.000000m,
    1.0000000m,
    1.00000000m,
    1.000000000m,
    1.0000000000m,
    1.00000000000m,
    1.000000000000m,
    1.0000000000000m,
    1.00000000000000m,
    1.000000000000000m,
    1.0000000000000000m,
    1.00000000000000000m,
    1.000000000000000000m,
    1.0000000000000000000m,
    1.00000000000000000000m,
    1.000000000000000000000m,
    1.0000000000000000000000m,
    1.00000000000000000000000m,
    1.000000000000000000000000m,
    1.0000000000000000000000000m,
    1.00000000000000000000000000m,
    1.000000000000000000000000000m,
    1.0000000000000000000000000000m,
    1.00000000000000000000000000000m,
    1.000000000000000000000000000000m,
    1.0000000000000000000000000000000m,
    1.00000000000000000000000000000000m,
    1.000000000000000000000000000000000m,
    1.0000000000000000000000000000000000m,
};

for (int i = 0; i < 1000; ++i)
{
    var d0 = i * data[0];
    var d0Hash = d0.GetHashCode();
    foreach (var d in data)
    {
        var value = i * d;
        var hash = value.GetHashCode();
        Console.WriteLine("{0};{1};{2};{3};{4};{5}", d0, value, (d0 == value), d0Hash, hash, d0Hash == hash);
    }
}

answered Oct 17 '22 12:10 by AxelEckenberger