Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Should sorting algorithm pass same element in the comparison function

The std::sort of libcxx (llvm version of c++ standard library) calls the comparison predicate with the same element i.e., both the arguments of comparison functor refer to the same position in the sequence to be sorted. A reduced example to illustrate the point.

$ cat a.cc
#include <algorithm>
#include <vector>
#include <cassert>

int main(int argc, char** argv) {
  int size = 100;
  std::vector<int> v(size);

  // Elements in v are unique.
  for (int i = 0; i < size; ++i)
    v[i] = i;

  std::sort(v.begin(), v.end(),
            [&](int x, int y) { assert(x != y); return x < y; });

  return 0;
}

$ clang++ -std=c++11 -stdlib=libc++ a.cc -o a.out
$ ./a.out
a.out: a.cc:14: auto main(int, char **)::(anonymous class)::operator()(int, int) const: Assertion `x != y' failed.
./go.sh: line 5: 19447 Aborted                 (core dumped) ./a.out

Works fine with libstdc++.

$ clang++ -std=c++11 -stdlib=libstdc++ a.cc -o a.out
$ ./a.out

Is this okay to call the comparison function with same element. Isn't this redundant.

like image 975
A. K. Avatar asked Aug 16 '16 04:08

A. K.


People also ask

How does the compare function work in sort?

The purpose of the compare function is to define an alternative sort order. When the sort() function compares two values, it sends the values to the compare function, and sorts the values according to the returned (negative, zero, positive) value. If the result is negative a is sorted before b .

How many comparisons and swap does insertion sort requires to sort five elements?

4.

Which sorting algorithm will take least time when all elements of input array are identical consider Typical implementations of sorting algorithms?

Que – 1. Which sorting algorithm will take the least time when all elements of input array are identical? Consider typical implementations of sorting algorithms. Solution: As discussed, insertion sort will have the complexity of n when the input array is already sorted.


Video Answer


2 Answers

I can speak with some authority on this question as I'm the one who wrote this code.

Here is the comparison which is asserting in your example:

https://github.com/llvm-mirror/libcxx/blob/master/include/algorithm#L3994-L3995

As the link is likely to go stale (point to the wrong line) as time goes on, I'll also quote the code here:

            // __m still guards upward moving __i
            while (__comp(*__i, *__m))
                ++__i;

This is known as an "unguarded" loop because there is no check for the iterator __i running off the end of the sequence as it is incremented. The reason this works is because an invariant of this algorithm is that at this point it is known that __i <= __m (which is also in a comment 3 lines above this quote).

If you look at the code above this quote, you will see these comments:

    // The search going up is known to be guarded but the search coming down isn't.
    // Prime the downward search with a guard.

So before we get to this point, a guarded search down the sequence is done. That is, this test:

if (__i == --__j)

After this test finds the lower guard, the algorithm then jumps to the unguarded loops which have only one test per iteration, whereas otherwise there would be two tests per iteration (a test on the iterator and a test on the dereferenced value of the iterator).

The use of "unguarded loops" is the cause of an element being compared with itself. During development I measured that the cost of the one extra comparison in the loop was a better deal than including two comparisons per iteration in the loop.

Of course this is an engineering tradeoff. If the comparison function turned out to be horribly expensive compared to the cost of comparing the iterators themselves, one could come to a different conclusion.

This answer is completely consistent with rici's answer which is also correct (and I've upvoted it). I added my voice as I could drop the "presumably" and point to specific lines of code in the algorithm.

like image 51
Howard Hinnant Avatar answered Nov 15 '22 19:11

Howard Hinnant


Presumably, in the opinion of the standard-library authors, it is faster to do a test which is guaranteed to return false than to constantly check for equal indices as well as comparing elements. This might come about because the pivot element is used as a loop sentinel.

It is certainly permitted for the comparison function to be called in this way, and the assert in your code is not permitted.

like image 43
rici Avatar answered Nov 15 '22 18:11

rici