Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

STL algorithm for equivalent ranges

Tags:

c++

stl

equals

By definition, the std::equal algorithm only takes one 'last' iterator. Many posts on stackoverflow indicate that to perform equivalence between two ranges, one must first check that the ranges have the same size in addition to calling std::equal. If random access iterators are available, this does not add any material overhead. However, it seems that without random access iterators, the first code fragment, implemented only with existing STL algorithms, will be slower than the second code fragment, which represents a custom "equivalent" algorithm (not part of STL). My question is, is fragment 2 more efficient than any algorithm coded only using existing STL algorithms? If so, why is this algorithm not part of STL?

Fragment 1:

template <typename IITR1, typename IITR2>
bool equivalent(IITR1 first1, IITR1 last1, IITR2 first2, IITR2 last2)
{
    return distance(first1, last1) == distance(first2, last2) &&
        equal( first1, last1, first2 );
}

Fragment 2:

template <typename IITR1, typename IITR2>
bool equivalent(IITR1 first1, IITR1 last1, IITR2 first2, IITR2 last2)
{
    while ( first1 != last1 && first2 != last2 ) {
       if (!(*first1 == *first2)) return false;
       ++first1; ++first2;
    }
    return first1 == last1 && first2 == last2;
}

Note: I haven't checked it, but I would be very doubtful that the compiler will optimize fragment 1 such that it produces code with the same performance produced by fragment 2.

To be complete, the following code fragment is next to useless, as it will fail if the range sizes are not equal:

template <typename IITR1, typename IITR2>
bool equivalent(IITR1 first1, IITR1 last1, IITR2 first2, IITR2 last2)
{
    return equal(first1, last1, first2) && equal(first2, last2, first1);
}
like image 987
MarkB Avatar asked Oct 06 '22 18:10

MarkB


2 Answers

It is likely that when the STL was being standardised the situation where the two ranges are non-random-access but different lengths was not considered; and looking at defect reports since it doesn't seem to have been a much demanded fix. As you've noted, it is quite easy to write your own version that gets it right.

A problem with retrofitting a fix would be deciding whether a four-parameter call is a two-range call (it1, it1_end, it2, it2_end) or a (it1, it1_end, it2, predicate) version. The techniques to distinguish (SFINAE, enable_if) have only fairly recently been available.

Note that Boost.Range has an implementation that gets it right; see http://www.boost.org/doc/libs/1_51_0/boost/range/algorithm/equal.hpp for the implementation. It also detects when the two iterator types are random-access and calls distance to short-circuit in the case where the lengths are different.

#include <boost/range/algorithm.hpp>

boost::equal(boost::make_iterator_range(first1, last1),
    boost::make_iterator_range(first2, last2));
like image 71
ecatmur Avatar answered Oct 09 '22 20:10

ecatmur


The first will only work for forward iterators, not input iterators in general.

Performance will depend on the iterator type. The first is likely to be faster for random-access iterators since the main loop of equal is simpler than the main loop of your second implementation; but is likely to be slower if distance has to iterate over the whole range.

To support input iterators, and to get the best performance in all circumstances, you'll probably need different specialisations for different iterator categories. I would start with the second, and specialise only if you find it's not sufficient.

I've no idea why it's not in the standard library, but you can't expect any library to contain every possible algorithm.

like image 38
Mike Seymour Avatar answered Oct 09 '22 19:10

Mike Seymour