Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

how was Array.Sort implemented in .NET?

I am using structures in my programming and I sort the structure according to a value in the structure using IComparer.

How did Microsoft implement the Array.Sort() method? Is there any documentation (references) for this? Is it the same for all types of Sort() in Visual Basic?

This is a simple example for what I want.

Dim MyArray(6) As Integer
    MyArray(0) = 1
    MyArray(1) = 45
    MyArray(2) = 45
   ' Some Code.....
    '.........
    '..........
    MyArray(3) = 1
    MyArray(4) = 10
    ' Some Code.....
    '.........
    '..........
    MyArray(5) = 1
    MyArray(6) = 57

    Array.Sort(MyArray)

Array.Sort() will sort this array as: (1 1 1 10 45 45 57)

How does number 1 get sorted? Is it bringing to the end the first one or keeps the old one in the same index?

In my original example (before sorting), MyArray(0) = 1 and after sorting MyArray(0) = 1.

Is this the same original 1 or this another 1 (the newest one added to the array) moved to that position?

In case the MyArray(0) = 1 after sorting should be MyArray(5) = 1 before sorting.

like image 677
Aharoun Baalan Avatar asked Nov 28 '22 08:11

Aharoun Baalan


2 Answers

It uses the Quicksort algorithm, which is not stable when implemented efficiently (in place). Meaning that it doesn't guarantee that values which are equal retain their prior relative position after sorting.

For example, if you have a bunch of points:

Point[] points = new Point[]
{
   new Point(0, 1),
   new Point(0, 2),
   new Point(0, 3),
   new Point(1, 1),
   new Point(1, 2),
   new Point(1, 3)
};

And you sort these points by x-coordinate only, using this comparer:

private int CompareByX(Point a, Point b)
{
    return a.X - b.X;
}

It will only guarantee that the points are sorted by their x-coordinate, meaning you could easily end up with a mixed up order (when looking at the y-coordinate):

Point(0, 3)
Point(0, 2)
Point(0, 1)
Point(1, 3)
Point(1, 2)
Point(1, 1)

[Edit]

This doesn't mean that the sorting algorithm is non-deterministic (random). For same input data, you will get same output data on each run. You can also predict the actual way it will be reorganized if you examine the algorithm precisely, but it is unnecessary. It is sufficient just to know that this happens when using the sort routine.

Here is a working example for your problem, try changing the test data sizes (first line in Main) and watch how the array gets reorganized on each run:

class Program
{
    static void Main()
    {
        Point[] points = CreateTestData(1, 4).ToArray();
        DisplayItems("Before", points);
        Array.Sort(points, CompareByX);
        DisplayItems("After", points);
        Console.ReadLine();
    }

    private class Point
    {
        public int X { get; private set; }
        public int Y { get; private set; }
        public override string ToString()
        { return string.Format("({0},{1})", X, Y); }
        public Point(int x, int y)
        { X = x; Y = y; }
    }

    private static int CompareByX(Point a, Point b)
    { return a.X - b.X; }

    private static IEnumerable<Point> CreateTestData(int maxX, int maxY)
    {
        for (int x = 0; x <= 1; x++)
            for (int y = 0; y <= 4; y++)
                yield return new Point(x, y);
    }

    private static void DisplayItems(string msg, Point[] points)
    {
        Console.WriteLine(msg);
        foreach (Point p in points)
            Console.WriteLine(p.ToString());
        Console.WriteLine();
    }
}

Of course, if you extend the comparer delegate to include the Y coordinate, you will not have this problem:

    private static int CompareByX(Point a, Point b)
    {
         if (a.X == b.X) 
            return a.Y - b.Y;
         else
            return a.X - b.X;
    }
like image 80
Groo Avatar answered Dec 21 '22 07:12

Groo


Array.Sort is an unstable sort, so the order of elements which are the same is undefined and not conserved. The article on Array.Sort in MSDN states:

This method uses the QuickSort algorithm. This implementation performs an unstable sort; that is, if two elements are equal, their order might not be preserved. In contrast, a stable sort preserves the order of elements that are equal.

LINQ's OrderBy methods on the other hand are stable. The article on OrderBy in the MSDN states:

This method performs a stable sort; that is, if the keys of two elements are equal, the order of the elements is preserved. In contrast, an unstable sort does not preserve the order of elements that have the same key.

like image 42
CodesInChaos Avatar answered Dec 21 '22 07:12

CodesInChaos