Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is std::sort() faster than std::make_heap()?

I have 13721057 elements in my std::vector<Sequence>. I need to sort this vector and grab the first 25 elements. I thought, since you can build a heap in O(N) it must be faster to pop 25 elements (each being O(logN)) than sorting the whole vector in O(NlogN).

However, when I time the code:

clock_t tStart = clock();
sort(mostFrequent.begin(), mostFrequent.end(), greater<Sequence>());
printf("Time taken: %.2fs\n", (double)(clock() - tStart)/CLOCKS_PER_SEC);

vs.

clock_t tStart = clock();
make_heap(mostFrequent.begin(), mostFrequent.end());
printf("Time taken: %.2fs\n", (double)(clock() - tStart)/CLOCKS_PER_SEC);

It appears to be much faster to sort the whole vector. Why is this?

like image 823
Murat Ayan Avatar asked Nov 30 '22 09:11

Murat Ayan


2 Answers

This is not a full answer but to get the first 25 elements out of 13721057 you better use partial_sort.

If you only need the 25th element, then nth_element.

As a side note. For getting the first elements less than X in sorted order, I would do auto mid = std::partition with a lambda, and then std::sort(begin,mid). There may be a better way.

like image 64
Johan Lundberg Avatar answered Dec 04 '22 13:12

Johan Lundberg


EDIT: As suggested in a comment I also tried with a pre-sorted input and in that case I did manage to get sort faster than make_heap for my "expensive to copy" type, but only by a small margin around 5-10%.

No matter what I try, I am unable to reproduce your results on either Solaris or Linux (gcc 4.4). make_heap has always come out on the order of 1/3rd the time spent.

  • No optimization vs -O3 only changes total time, not relative order.
  • I used your exact number of items.
  • First tried sorting int then a larger "expensive to copy" class.
  • Guessed what includes you were using.
  • Moved timing calls outside the printf to make sure they were always ordered properly.

I assume that the actual reason for this discrepancy is that either your < and > operators aren't the same complexity or that copying your object is somehow expensive relative to comparing it in a way my test was unable to duplicate.

like image 37
Mark B Avatar answered Dec 04 '22 15:12

Mark B