Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is my C# quicksort implementation significantly slower than List<T>.Sort

I've implemented a version of quicksort in C#, and performed some quick benchmarks to compare against the C# List<T>.Sort. I have found my implementation to be significantly slower than the library version. I am trying to understand why. Here are some rough benchmark numbers. For input I used a list of integers that were generated randomly (uniform) with few duplicate elements. Note that all benchmark times are averages of several runs.

100,000 elements
My code         0.096 seconds
List<T>.Sort    0.011 seconds

1,000,000 elements
My code         1.10 seconds
List<T>.Sort    0.14 seconds

My code is below. Here is a list of optimizations I've tried and their results:

  • In-line - I've tried to in-line my Swap and ChoosePivotIndex functions I see around a 10% improvement.
  • Pivot selection - I know that I'm using a naive pivot selection method. I've tried using the median of three random elements as well. This does not yield much improvement. I'm guessing this is because the input data I'm using to benchmark is uniformly random.
  • Parallelism - I've tried making the recursive Partition calls in parallel. This seems to actually increase execution time.
  • Special casing small input - I've tried switching to insertion sort for small input (as List<T>.Sort claims to do). That yields around a 20% improvement.

With a combination of these optimizations, I've been able to get my code down to

100,000 elements -   0.062 seconds
1,000,000 elements - 0.740 seconds 

This is still significantly slower than the library Sort. Is there any obvious in my code that would explain the performance gap, or is the remaining 70-80 percent gap from more small tweaks? Note that the code below is my base unoptimized verison

public class Quicksorter<T> where T : IComparable<T>
{
    protected Random random;
    public Quicksorter()
    {
        random = new Random();
    }

    public void Sort(IList<T> items)
    {
        Partition(items, 0, items.Count-1);
    }

    private void Partition(IList<T> items, int startIndex, int endIndex)
    {
        if (startIndex >= endIndex)
            return;

        int pivotIndex = ChoosePivotIndex(items, startIndex, endIndex);
        T pivot = items[pivotIndex];

        // swap pivot and first element
        Swap(items, startIndex, pivotIndex);

        int partitionIndex = startIndex + 1;
        for (int frontierIndex = partitionIndex; frontierIndex <= endIndex; frontierIndex++)
        {
            if (items[frontierIndex].CompareTo(pivot) < 0)
            {
                Swap(items, frontierIndex, partitionIndex);
                partitionIndex++;
            }
        }
        // put pivot back
        items[startIndex] = items[partitionIndex-1];
        items[partitionIndex - 1] = pivot;

        // recursively sort left half
        Partition(items, startIndex, partitionIndex - 2);
        // recursively sort right half
        Partition(items, partitionIndex, endIndex);
    }

    protected virtual int ChoosePivotIndex(IList<T> items, int startIndex, int endIndex)
    {
        return random.Next(startIndex, endIndex);
    }

    private void Swap(IList<T> items, int aIndex, int bIndex)
    {
        T temp = items[aIndex];
        items[aIndex] = items[bIndex];
        items[bIndex] = temp;
    }
}
like image 989
mattnedrich Avatar asked Feb 17 '14 00:02

mattnedrich


1 Answers

Because the .NET Framework has special-case sorting methods for integers, strings, and other built-in types. They don't incur the cost of calling delegates for comparisons, etc. If you're comparing your sort using built-in types, the library sort is typically going to be much, much faster.

like image 137
Jim Mischel Avatar answered Sep 18 '22 11:09

Jim Mischel