Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

for loop vs std::for_each with lambda [duplicate]

Tags:

c++

c++11

lambda

Let's consider a template function written in C++11 which iterates over a container. Please exclude from consideration the range loop syntax because it is not yet supported by the compiler I'm working with.

template <typename Container>
void DoSomething(const Container& i_container)
  {
  // Option #1
  for (auto it = std::begin(i_container); it != std::end(i_container); ++it)
    {
    // do something with *it
    }

  // Option #2
  std::for_each(std::begin(i_container), std::end(i_container), 
    [] (typename Container::const_reference element)
    {
    // do something with element
    });
  }

What are pros/cons of for loop vs std::for_each in terms of:

a) performance? (I don't expect any difference)

b) readability and maintainability?

Here I see many disadvantages of for_each. It wouldn't accept a c-style array while the loop would. The declaration of the lambda formal parameter is so verbose, not possible to use auto there. It is not possible to break out of for_each.

In pre- C++11 days arguments against for were a need of specifying the type for the iterator (doesn't hold any more) and an easy possibility of mistyping the loop condition (I've never done such mistake in 10 years).

As a conclusion, my thoughts about for_each contradict the common opinion. What am I missing here?

like image 543
Andrey Avatar asked Aug 14 '12 16:08

Andrey


3 Answers

I think there are some other differences not yet covered by the answers so far.

  1. a for_each can accept any appropriate callable object, allowing one to 'recycle' the loop body for different for loops. For example (pseudo code)

    for( range_1 ) { lengthy_loop_body }    // many lines of code
    for( range_2 ) { lengthy_loop_body }    // the same many lines of code again
    

    becomes

    auto loop_body = some_lambda;           // many lines of code here only
    std::for_each( range_1 , loop_body );   // a single line of code
    std::for_each( range_2 , loop_body );   // another single line of code
    

    thus avoiding duplication and simplifying code maintenance. (Of course, in a funny mix of styles one could also use a similar approach with the for loop.)

  2. another difference regards breaking out of the loop (with break or return in the for loop). As far as I know, in an for_each loop this can only be done by throwing an exception. For example

    for( range )
    {
      some code;
      if(condition_1) return x; // or break
      more code;
      if(condition_2) continue;
      yet more code;
    }
    

    becomes

    try {
      std::for_each( range , [] (const_reference x)
                    {
                      some code;
                      if(condition_1) throw x;
                      more code;
                      if(condition_2) return;
                      yet more code;
                    } );
    } catch(const_reference r) { return r; }
    

    with the same effects regarding calling of destructors for objects with scope of the loop body and the function body (around the loop).

  3. the main benefit of for_each is, IMHO, that one can overload it for certain container types, when plain iteration is not as efficient. For example, consider a container that holds a linked list of data blocks, each block containing a contiguous array of elements, similar to (omitting irrelevant code)

    namespace my {
      template<typename data_type, unsigned block_size>
      struct Container
      {
        struct block
        {
          const block*NEXT;
          data_type DATA[block_size];
          block() : NEXT(0) {}
        } *HEAD;
      };
    }
    

    then an appropriate forward iterator for this type would require to check for the end of block at each increment and the comparison operator needs to compare both the block pointer and the index within each block (omitting irrelevant code):

    namespace my {
      template<typename data_type, unsigned block_size>
      struct Container
      {
        struct iterator
        {
          const block*B;
          unsigned I;
          iterator() = default;
          iterator&operator=(iterator const&) = default;
          iterator(const block*b, unsigned i) : B(b), I(i) {}
          iterator& operator++()
          {
            if(++I==block_size) { B=B->NEXT; I=0; }    // one comparison and branch
            return*this;
          }
          bool operator==(const iterator&i) const
          { return B==i.B && I==i.I; }                 // one or two comparisons
          bool operator!=(const iterator&i) const
          { return B!=i.B || I!=i.I; }                 // one or two comparisons
          const data_type& operator*() const
          { return B->DATA[I]; }
        };
        iterator begin() const
        { return iterator(HEAD,0); }
        iterator end() const
        { return iterator(0,0); }
      };
    }
    

    this type of iterator works correctly with for and for_each, for example

    my::Container<int,5> C;
    for(auto i=C.begin();
        i!=C.end();              // one or two comparisons here
        ++i)                     // one comparison here and a branch
      f(*i);
    

    but requires two to three comparisons per iteration as well as a branch. A more efficient way is to overload the for_each() function to loop on the block pointer and index separately:

    namespace my {
      template<typename data_type, int block_size, typename FuncOfDataType>
      FuncOfDataType&&
      for_each(typename my::Container<data_type,block_size>::iterator i,
               typename my::Container<data_type,block_size>::iterator const&e,
               FuncOfDataType f)
      {
        for(; i.B != e.B; i.B++,i.I=0)
          for(; i.I != block_size; i.I++)
            f(*i);
        for(; i.I != e.I; i.I++)
          f(*i);
        return std::move(f);
      }
    }
    using my::for_each;     // ensures that the appropriate
    using std::for_each;    // version of for_each() is used
    

    which requires only one comparison for most iterations and has no branches (note that branches can have a nasty impact on performance). Note that we don't need to define this in namespace std (which might be illegal), but can ensure that the correct version is used by appropriate using directives. This is equivalent to using std::swap; when specialising swap() for certain user-defined types.

like image 95
Walter Avatar answered Oct 20 '22 20:10

Walter


Regarding perfomance, your for loop calls std::end repeatedly, while std::for_each will not. This might or might not result in a performance difference depending on the container used.

like image 7
eq- Avatar answered Oct 20 '22 20:10

eq-


  • The std::for_each version will visit each element exactly once. Somebody reading the code can know that as soon as they see std::for_each, as there's nothing that can be done in the lambda to mess with the iterator. In the traditional for loop, you have to study the body of the loop for unusual control flow (continue, break, return) and dinking with the iterator (e.g., in this case, skip the next element with ++it).

  • You can trivially change the algorithm in the lambda solution. For example, you could make an algorithm that visits every nth element. In many cases, you didn't really want a for loop anyway, but a different algorithm like copy_if. Using an algorithm+lambda, is often more amenable to change and is a bit more concise.

  • On the flip side, programmers are much more used to traditional for loops, so they may find algorithm+lambda to be harder to read.

like image 4
Adrian McCarthy Avatar answered Oct 20 '22 22:10

Adrian McCarthy