Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Would std::count_if be faster without an if?

Here's the gcc std::count_if code

template<typename _InputIterator, typename _Predicate>
  typename iterator_traits<_InputIterator>::difference_type
  count_if(_InputIterator __first, _InputIterator __last, _Predicate __pred)
 {
  [snip]
  typename iterator_traits<_InputIterator>::difference_type __n = 0;
  for (; __first != __last; ++__first)
    if (__pred(*__first))
      ++__n;
  return __n;
}

My question: would it work better (i.e., faster) to use

__n += __pred(*__first); // instead of the if statement

This version always does an add, but doesn't do a branch.

like image 435
Andrew Lazarus Avatar asked Oct 07 '14 21:10

Andrew Lazarus


2 Answers

The replacement you gave is not equivalent, because there are far fewer restrictions on a predicate than you think:

  • Anything which can be used in a conditional context (can be contextually converted to bool), is a valid return-type for the predicate (an explicit conversion to bool is enough).
  • That return-type can react funny to being added to the iterators difference-type.

25 Algorithms library [algorithms]

25.1 General [algorithms.general]

8 The Predicate parameter is used whenever an algorithm expects a function object (20.9) that, when applied to the result of dereferencing the corresponding iterator, returns a value testable as true. In other words, if an algorithm takes Predicate pred as its argument and first as its iterator argument, it should work correctly in the construct pred(*first) contextually converted to bool (Clause 4). The function object pred shall not apply any non-constant function through the dereferenced iterator.

The most likely return giving your replacement indigestion would be a standard integer-type, and a value neither 0 nor 1.

Also, keep in mind that compilers can actually optimize really good nowadays (and especially C++ ones need to, with all that template-stuff layered deep).

like image 150
Deduplicator Avatar answered Nov 15 '22 21:11

Deduplicator


So, first, your suggested code is different. So let's look at two equivalent codes:

template<typename _InputIterator, typename _Predicate>
typename iterator_traits<_InputIterator>::difference_type
count_if(_InputIterator __first, _InputIterator __last, _Predicate __pred) {
    typename iterator_traits<_InputIterator>::difference_type __n = 0;
    for (; __first != __last; ++__first)
        if (__pred(*__first))
            ++__n;
    return __n;
}

And:

template<typename _InputIterator, typename _Predicate>
typename iterator_traits<_InputIterator>::difference_type
count_if(_InputIterator __first, _InputIterator __last, _Predicate __pred) {
    typename iterator_traits<_InputIterator>::difference_type __n = 0;
    for (; __first != __last; ++__first)
        __n += (bool) __pred(*__first);
    return __n;
}

Then, we can compile this with our compiler and look at the assembly. Under one compiler that I tried (clang on os x), these produced identical code.

Perhaps your compiler will also produce identical code, or perhaps it might produce different code.

like image 30
Bill Lynch Avatar answered Nov 15 '22 20:11

Bill Lynch