Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Extract input iterator from std::copy and std::copy_n

Tags:

c++

iterator

stl

I trying to implement an unserialization method that take an input iterator and perform a series of chunk reads (using std::copy and std::copy_n). Something like this (just an example):

template <class InputIt>
InputIt unserialize(InputIt it)
{
  std::copy_n(it, sizeof(header_type), reinterpret_cast<char*>(&header));
  std::copy_n(it, header.payload_size, std::back_inserter(payload));
  it = optional.unserialize(it);
  return it;
}

How do I advance input iterator in this case so each following call to std::copy_n continue read from it and I can finally return it?

I want to be generic to iterator categories (especially RandomAccessIterator and InputIterator) for performance reasons and hope it is possible to use std::copy methods without need to rewrite these. Stuff like bound checking and such will be done by iterator adapter or checked before unserialization call if size is known.

What do not work nor acceptable:

  1. Using std::copy_n<InputIt&>(it, ...) may work for some categories but not for all and it's too unreliable.
  2. Using std::advance after each call leads to rereading same block again for some iterators. Not preferable and may be impossible for some sources.

UPDATE Making iterator reference adapter doesn't help as random access iterator version of copy_n return pointer to past the last element copied while input iterator version return pointer to the last element copied. So I guess own version of copy_n works best with additional iterator adapter for bound checking.

like image 900
Vladimir Talybin Avatar asked Feb 02 '18 10:02

Vladimir Talybin


2 Answers

For random access iterator this form can be used - and it is fine:

template <class InputIt, class N, class OutputIt>
InputIt copy_n_advance_input(InputIt it, N dist, OutputIt outIt)
{
    std::copy_n(it, dist, outIt);
    return std::next(it, dist);
}

Unfortunately - problems are when we want to deal with one pass input iterator - like here (got 'd' - not 'c'):

std::string s = "abcd";
std::istringstream ss{s};
auto e = copy_n_advance_input(std::istream_iterator<char>(ss), 
                             2, 
                             std::ostream_iterator<char>(std::cout, ","));
std::cout << "\n" << *e << "\n";

So it seems it is needed, as usual in STL, two forms:

template <class InputIt, class N, class OutputIt>
InputIt copy_n_advance_input_impl(InputIt it, N dist, OutputIt outIt,
                                  std::input_iterator_tag)
{
    while (dist-- > 0)
    {
        *outIt = *it;
        ++outIt;
        ++it;
    }
    return it;
}

template <class InputIt, class N, class OutputIt>
InputIt copy_n_advance_input_impl(InputIt it, N dist, OutputIt outIt, 
                                  std::random_access_iterator_tag)
{
    std::copy_n(it, dist, outIt);
    return std::next(it, dist);
}

template <class InputIt, class N, class OutputIt>
InputIt copy_n_advance_input(InputIt it, N dist, OutputIt outIt)
{
    return copy_n_advance_input_impl(it, dist, outIt, typename std::iterator_traits<InputIt>::iterator_category {});
}

Note, the proposed version for std::input_iterator_tag is not as efficient as in STL (at least for gcc) - it does extra iteration for input - this iteration is not necessary to perform copy - but it is needed to return beginning of "after copy" range (stl_algo.h):

754   template<typename _InputIterator, typename _Size, typename _OutputIterator>
755     _OutputIterator
756     __copy_n(_InputIterator __first, _Size __n,
757          _OutputIterator __result, input_iterator_tag)
758     {
759       if (__n > 0)
760     {
761       while (true)
762         {
763           *__result = *__first;
764           ++__result;
765           if (--__n > 0)
766         ++__first;
767           else
768         break;
769         }
770     }
771       return __result;
772     }

Last note - it is wiser to use, for random-access-iterators (like std::vector::iterator) the version that just calls std algorithms - because they can be even more optimized - e.g. for contiguous memory iterator over POD types - that can be just memcpy'ied. Or some specialization for std::vector<bool> or std::deque<T> exist, that uses their internal structure to perform copy in most efficient way.

like image 64
PiotrNycz Avatar answered Oct 22 '22 23:10

PiotrNycz


You could create an iterator adapter that always works through a reference to the supplied iterator. Here's the basic skeleton of such an adapter:

#include <iterator>

template<typename InputIterator>
struct IteratorReference
{
    InputIterator& it;

    IteratorReference(InputIterator& it)
        : it(it)
    {}
    // {copy, move, destructor} == default

    // iterator methods (TODO: add the rest)
    IteratorReference& operator++()
    { ++it; return this; }

    typename InputIterator::reference operator*()
    { return *it; }

    // Convert back to original iterator
    operator InputIterator() const
    { return it; }
};


template<typename InputIterator>
IteratorReference<InputIterator> make_iterator_reference(InputIterator it)
{
    return it;
}

To use it, simply create and use a wrapper in place of the original iterator:

#include <algorithm>
#include <vector>

struct Header
{
    std::size_t payload_size;
};

template <class InputIt>
InputIt unserialize(InputIt it, Header& header, std::vector<char>& payload)
{
    auto i_ref = make_iterator_reference(it);
    std::copy_n(i_ref, sizeof header, reinterpret_cast<char*>(&header));
    std::copy_n(i_ref, header.payload_size, std::back_inserter(payload));
    i_ref = optional.unserialize(i_ref);
    return i_ref;
}
like image 2
Toby Speight Avatar answered Oct 22 '22 22:10

Toby Speight