Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does .NET 4.0 sort this array differently than .NET 3.5?

Tags:

c#

.net

This stackoverflow question raised an interesting question about sorting double arrays with NaN values. The OP posted the following code:

static void Main(string[] args)
{
    double[] someArray = { 4.0, 2.0, double.NaN, 1.0, 5.0, 3.0, double.NaN, 10.0, 9.0, 8.0 };

    foreach (double db in someArray)
    {
        Console.WriteLine(db);
    }

    Array.Sort(someArray);
    Console.WriteLine("\n\n");
    foreach (double db in someArray)
    {
        Console.WriteLine(db);
    }

    Console.ReadLine();
}

When you run this under the .NET 3.5 framework, the array is sorted as follows:

1,4,NaN,2,3,5,8,9,10,NaN

When you run it under .NET 4.0, the array is sorted somewhat more logically:

NaN,NaN,1,2,3,4,5,8,9,10

I can understand why it would sort weirdly in .NET 3.5 (because NaN is not equal to, less than, or greater than, anything). I can also understand why it would sort the way it does in .NET 4.0. My question is, why did this change from 3.5 to 4.0? And where is the Microsoft documentation for this change?

like image 398
Phil Avatar asked Feb 25 '11 23:02

Phil


3 Answers

It's a bug fix. The feedback report with the bug details is here. Microsoft's response to the bug report:

Note that this bug affects the following:

  • Array.Sort(), where the array contains Double.NaN
  • Array.Sort(), where the array contains Single.NaN
  • any callers of above, for example on List.Sort(), where list contains Double.NaN

This bug will be fixed in the next major version of the runtime; until then you can work around this by using a custom IComparer that does the correct sorting. As mentioned in the workaround comments, don't use Comparer.Default, because this is special-cased with a shortcut sort routine that doesn't handle NaN correctly. Instead, you can provide your own comparer that provides an equivalent comparision, but won't be special-cased.

like image 169
Hans Passant Avatar answered Nov 06 '22 06:11

Hans Passant


Not really an answer, but perhaps a clue... You can reproduce the weird 3.5 behavior in 4.0 with this code:

void Main()
{
    double[] someArray = { 4.0, 2.0, double.NaN, 1.0, 5.0, 3.0, double.NaN, 10.0, 9.0, 8.0 };
    Array.Sort(someArray, CompareDouble);
    someArray.Dump();
}

int CompareDouble(double a, double b)
{
    if (a > b)
        return 1;
    if (a < b)
        return -1;
    return 0;
}

Here, both a > b and a < b return false if a or b is NaN, so the CompareDouble method returns 0, so NaN is considered equal to everything... This gives the same result as in 3.5:

1,4,NaN,2,3,5,8,9,10,NaN
like image 22
Thomas Levesque Avatar answered Nov 06 '22 06:11

Thomas Levesque


I don't have the code for the .NET 3.5 runtime to verify this, but I'm thinking they fixed a bug in the default comparer for double to bring it into line with the documentation.

According to that document, Double.Compare treats NaN as equal to PositiveInfinity and NegativeInfinity, and less than any other value.

The documentation is the same for .NET 3.5 and .NET 4.0, so I have to think that this was a bug fix to make the code work as documented.

EDIT:

After reading the comments in the linked question, I have to think that the problem wasn't in Double.Compare, but rather in whatever method Array.Sort (which is what List.Sort depends on) uses to compare Double values. Somehow, I don't think it's really Double.Compare.

like image 2
Jim Mischel Avatar answered Nov 06 '22 05:11

Jim Mischel