Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there a sorting routine faster than qsort?

Tags:

c++

sorting

This is not an algorithmic question, but an implementation question.

I have a data structure that looks like:

struct MyStruct {
   float val;
   float val2;
   int idx;
}

I go through an array of about 40 million elements, and assign the 'val' fields to be the element, and the 'idx' field to be the index.

I'm then calling:

MyStruct* theElements = new MyStruct[totalNum];
qsort(theElements, totalNum, sizeof(MyStruct), ValOrdering);

and then, once I fill in val2, reversing the procedure with

qsort(theElements, totalNum, sizeof(MyStruct), IndexOrdering);

where

static int ValOrdering(const void* const v1, const void* const v2)
{
  if (((struct MyStruct*) v1)->val < ((struct MyStruct*) v2)->val)
    return -1;

  if (((struct MyStruct*) v1)->val> ((struct MyStruct*) v2)->val)
    return 1;

  return 0;
}

and

static int IndexOrdering(const void* const v1, const void* const v2)
{
  return ((struct MyStruct*) v1)->idx- ((struct MyStruct*) v2)->idx;
}

This setup takes 4 seconds to perform both sorts. 4 seconds seems like a long time for a sort of 40 million elements to take on a 3Ghz i5 processor; is there a faster approach? I'm using vs2010 with the Intel Compiler (that has sorts, but not over structs like this that I can see).

Update: Using std::sort shaves about 0.4 seconds off of the runtime, called like:

std::sort(theElements, theElements + totalPixels, ValOrdering);
std::sort(theElements, theElements + totalPixels, IndexOrdering);

and

bool GradientOrdering(const MyStruct& i, const MyStruct& j){
    return i.val< j.val;
}
bool IndexOrdering(const MyStruct& i, const MyStruct& j){
    return i.idx< j.idx;
}

adding the 'inline' keyword to the predicates does not seem to matter. Since i have, and the spec allows for, a quad core machine, I'll check some kind of multithreaded sort next.

Update 2: Following @SirGeorge and @stark, I took a look at a single sort done via pointer redirects:

bool GradientOrdering(MyStruct* i, MyStruct* j){
    return i->val< j->val;
}
bool IndexOrdering(MyStruct* i, MyStruct* j){
    return i->idx< j->idx;
} 

Even though there is just a single sorting call (to the GradientOrdering routine), the resulting algorithm takes 5 seconds, 1 second longer than the qsort approach. It looks like std::sort is winning for now.

Update 3: Looks like Intel's tbb::parallel_sort is the winner, taking the runtime of a single sort down to 0.5s on my system (so, 1.0s for both, which means that it's scaling pretty well from the original 4.0s for both). I tried to go with a parallel fanciness proposed by Microsoft here, but since I'm already using tbb and the syntax for parallel_sort is identical to the syntax for std::sort, I could use my earlier std::sort comparators to get everything finished.

I also used @gbulmer's suggestion (really, hitting-me-over-the-head realization) that I already have the original indeces, so instead of doing a second sort, I just need to assign a second array with the indeces from the first back into sorted order. I can get away with this memory usage because I'm only deploying on 64bit machines with at least 4 gb of RAM (good to have these specs worked out ahead of time); without that knowledge, a second sort would be necessary.

@gbulmer's suggestion gives the most speedup, but the original question asked about fastest sort. std::sort is the fastest single-threaded, parallel_sort is the fastest multithreaded, but no one gave that answer, so I'm giving @gbulmer the check.

like image 607
mmr Avatar asked Mar 23 '12 21:03

mmr


1 Answers

Generally speaking, C++'s std::sort located in algorithm will beat qsort, because it allows the compiler to optimize away the indirect call over the function pointer, and makes it easier for the compiler to perform inlining. However, this is only going to be a constant factor speedup; qsort already uses a very fast sorting algorithm.

Do note that if you decide to switch over to std::sort, that your comparison functor will have to change. std::sort accepts a simple less than comparison returning bool, while std::qsort accepts a functor returning -1, 0, or 1 depending on the input.

like image 89
Billy ONeal Avatar answered Oct 10 '22 06:10

Billy ONeal