Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

"end()" iterator for back inserters?

Tags:

c++

iterator

stl

For iterators such as those returned from std::back_inserter(), is there something that can be used as an "end" iterator?

This seems a little nonsensical at first, but I have an API which is:

template<typename InputIterator, typename OutputIterator>
void foo(
    InputIterator input_begin,
    InputIterator input_end,
    OutputIterator output_begin,
    OutputIterator output_end
);

foo performs some operation on the input sequence, generating an output sequence. (Who's length is known to foo but may or may not be equal to the input sequence's length.)

The taking of the output_end parameter is the odd part: std::copy doesn't do this, for example, and assumes you're not going to pass it garbage. foo does it to provide range checking: if you pass a range too small, it throws an exception, in the name of defensive programming. (Instead of potentially overwriting random bits in memory.)

Now, say I want to pass foo a back inserter, specifically one from a std::vector which has no limit outside of memory constraints. I still need a "end" iterator - in this case, something that will never compare equal. (Or, if I had a std::vector but with a restriction on length, perhaps it might sometimes compare equal?)

How do I go about doing this? I do have the ability to change foo's API - is it better to not check the range, and instead provide an alternate means to get the required output range? (Which would be needed anyways for raw arrays, but not required for back inserters into a vector.) This would seem less robust, but I'm struggling to make the "robust" (above) work.

like image 768
Thanatos Avatar asked Jan 01 '11 07:01

Thanatos


2 Answers

As I mentioned in a comment to j_random_hacker's answer, if you modify the algorithm such that the iterators passed to it can be of different types, for example,

template <typename InIt1, InIt2, OutIt1, OutIt2>
void foo(InIt1 in_begin, InIt2 in_end, OutIt1 out_begin, OutIt2 out_end) { }

then you can very easily write a back_inserter_end class to test against:

class back_inserter_end 
    : std::iterator<std::output_iterator_tag, void>
{ };

bool operator==(back_inserter_end, back_inserter_end) { return true;  }
bool operator!=(back_inserter_end, back_inserter_end) { return false; }

template <typename Container>
bool operator==(const back_insert_iterator<Container>&, back_inserter_end) { 
     return false; 
}

template <typename Container>
bool operator==(back_inserter_end, const back_insert_iterator<Container>&) { 
    return false; 
}

template <typename Container>
bool operator!=(const back_insert_iterator<Container>&, back_inserter_end) { 
    return true; 
}

template <typename Container>
bool operator!=(back_inserter_end, const back_insert_iterator<Container>&) { 
    return true; 
}

You can then call your "safe" algorithm:

foo(it, it, std::back_inserter(v), back_inserter_end());

Inside of your "safe" algorithm, you can test whether out_it == out_end before using out_it; since out_it is a back_insert_iterator, this test will always return false (which is the desired behavior).

like image 114
James McNellis Avatar answered Oct 16 '22 12:10

James McNellis


You can avoid changing foo()'s API by performing the safety check as you go, by checking that curr_output_iter != output_end before each element is written out (see below), instead of once at the start with a distance() check as James McNellis suggests.

Doing this will require writing your own "iterator adaptor" to provide an "enhanced" output iterator that can test for whether it's at the end of its valid range. Then you would rightfully change the template typenames from OutputIterator to SafeOutputIterator -- even though this just serves as documentation because it's something that can't be enforced in C++. We distinguish between "bounded" and "unbounded" iterator pairs: for the latter, no two iterators will ever compare equal, and in fact the 2nd iterator will never be needed for anything besides its type.

/* Metafunction for determining whether the range has a fixed endpoint.
 * Assume all iterators are bounded except for OutputIterators. */
template <typename Tag>
struct helper {
    static bool value = true;
};

template <>
struct helper<output_iterator_tag> {
    static bool value = false;
};

template <typename It>
struct is_bounded {
    static bool value = helper<typename iterator_traits<It>::iterator_category>::value;
};

/* Wraps an output iterator. */
template<typename It, bool BOUNDED = is_bounded<It>::value>
class safe_output_iterator {
    typedef typename iterator_traits<It>::value_type value_type;

public:
    explicit safe_output_iterator(It iter, size_t limit = 0) : iter_(iter), limit_(limit) {}

    safe_output_iterator& operator++() {  /* Preinc */
        ++iter_;
        return *this;
    }

    safe_output_iterator operator++(int) {  /* Postinc */
        safe_output_iterator temp(*this);
        ++iter_;
        return temp;
    }

    /* Trick: Deferencing returns the same object, so that operator=() works */
    safe_output_iterator& operator*() {
        return *this;
    }

    /* Will be called despite "dereferencing" this iterator object */
    safe_output_iterator& operator=() {
        /* Insert check for limit == 0 here if you want */
        --limit_;
        return *_iter;
    }

    /* Addition to the OutputIterator concept. */
    bool operator==(safe_output_iterator const& x) {
        return BOUNDED && limit_ == x.limit_;
    }

    bool operator!=(safe_output_iterator const& x) {
        return !(*this == x);
    }

private:
    It iter_;
    size_t limit_;
};

/* Helper function templates to avoid typing out long typenames */

/* Handle iterators */
template <typename It>
safe_output_iterator<It> soi_begin(It it, size_t limit = 0) {
    return safe_output_iterator<It>(it, limit);
}

template <typename It>
safe_output_iterator<It> soi_end(It it, size_t limit = 0) {
    return safe_output_iterator<It>(it, limit);
}

/* Handle STL containers like vector and list */
template <typename C>
safe_output_iterator<It> soi_begin_cont(C cont) {
    return safe_output_iterator<typename C::iterator>(cont.begin(), cont.size());
}

template <typename C>
safe_output_iterator<It> soi_end_cont(C cont) {
    /* Actually the iterator supplied (here cont.end()) is unimportant... */
    return safe_output_iterator<typename C::iterator>(cont.end());
}

You can now write code like:

vector<int> u(10, 42), v(ENOUGH_SPACE), w, x(ENOUGH_SPACE);

// Explicit iterator pair; bounded
foo(u.begin(), u.end(), soi_begin(v.begin(), ENOUGH_SPACE), soi_end(v));

// Explicit iterator pair; unbounded (2nd iterator ignored, but we need the type)
foo(u.begin(), u.end(), soi_begin(w.back_inserter()), soi_end(w.back_inserter()));

// Use whole container
foo(u.begin(), u.end(), soi_begin_cont(x), soi_end_cont(x));

You can either have foo() check curr_output_iter != output_end before each write through *curr_output_iter, or alternatively insert the check in safe_output_iterator::operator=(). The latter seems preferable as you then cannot forget to perform the check whenever a pair of safe_output_iterators are passed to any arbitrary algorithm (presumably this is similar to how debug versions of the STL work). You could also write overloads of soi_begin_cont() and soi_end_cont() for fixed-size arrays.

All this would be much less cumbersome if ranges were used by STL algorithms instead of iterator pairs -- that way a single function template could return an appropriate range spanning an entire array, for example.

like image 24
j_random_hacker Avatar answered Oct 16 '22 14:10

j_random_hacker