Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is it better to call vector::reserve() before calling vector::assign()?

Tags:

c++

stl

I understand it's a good practice to use "reserve" to avoid unnecessary reallocations (Item 14 of Effective STL):

std::vector<int> v1;
v1.reserve(1000);

for (int i = 0; i < 1000; i++)
    v1.push_back(i);

Does the same rule apply when you call assign?

std::vector<int> v2;
//v2.reserve(v1.size()); // Better to do this?
v2.assign(v1.begin(), v1.end());
like image 351
jpen Avatar asked Jul 01 '12 14:07

jpen


2 Answers

In case when v1 is std::vector you don't really need it, as the compiler/stl knows how many items are going to be there in v2 (and will reserve the needed amount itself before copying the actual data).

For the generic case, however, it may make sense to reserve the needed amount in advance, if the input container (v1) doesn't know how many items are there, and you have the number at hand.

like image 66
Vlad Avatar answered Nov 07 '22 02:11

Vlad


Whether to call reserve or not depends on:

  • the iterator type: for input iterators the implementation cannot guess the size
  • the quality of the library: it may not specialize for "better" iterators
  • whether performance is worth the decrease in maintainability

Let us take the 3 points in order.

1) Iterator Type

The assign method takes two iterators that must at least conform to the InputIterator model. The problem is that this model represents pure sources (like the bytes that come from the network): you can consume something twice out of it. As a result, given two InputIterator it is not possible to compute the distance between them without also extracting the data (unless you don't want the data at all, but that's not what assign is about), so you cannot "reserve" first.

This is illustrated by the fact that std::distance requires at least FowardIterator.

2) The quality of implementation

I do not think that the Standard actually mandates for "better" iterators (which at least model ForwardIterator) that the implementation of assign walk the range twice. In a computation limited by the memory-bandwidth (imagine reading that information on a tape, with a very slow rewind time), this would actually be more costly.

However, many implementations (such as libc++, see below) will specialize assign so that in the presence of ForwardIterator it calls std::distance first to reserve the right amount of memory if necessary.

Note: the same applies to mass insertions by the way.

3) Maintenance burden

I would note that despite the possible gain, you are (perhaps unknowingly) duplicating information here.

size_t size = std::distance(begin, end);

if (begin != end) ++begin; // new line

v.reserve(size);
v.assign(begin, end);

See how the appearance of the new line suddenly makes the code slightly incorrect ? Not that it will not work, but the purported optimization is not so correct any longer: you reserve too much now!

Personally I would trust in my Standard library implementation to do right thing. The guys writing them have just much more experience than I do.

And if really it is an identified bottleneck in your application, you can always try your way. Just write a reserve_and_assign method to make it obvious what it does and measure if it's better.


For reference, here is libc++ implementation, taken here:

template <class _Tp, class _Allocator>
template <class _InputIterator>
typename enable_if
<
     __is_input_iterator  <_InputIterator>::value &&
    !__is_forward_iterator<_InputIterator>::value,
    void
>::type
vector<_Tp, _Allocator>::assign(_InputIterator __first, _InputIterator __last)
{
    clear();
    for (; __first != __last; ++__first)
        push_back(*__first);
}

template <class _Tp, class _Allocator>
template <class _ForwardIterator>
typename enable_if
<
    __is_forward_iterator<_ForwardIterator>::value,
    void
>::type
vector<_Tp, _Allocator>::assign(_ForwardIterator __first, _ForwardIterator __last)
{
    typename iterator_traits<_ForwardIterator>::difference_type __new_size = _VSTD::distance(__first, __last);
    if (static_cast<size_type>(__new_size) <= capacity())
    {
        _ForwardIterator __mid = __last;
        bool __growing = false;
        if (static_cast<size_type>(__new_size) > size())
        {
            __growing = true;
            __mid =  __first;
            _VSTD::advance(__mid, size());
        }
        pointer __m = _VSTD::copy(__first, __mid, this->__begin_);
        if (__growing)
            __construct_at_end(__mid, __last);
        else
            this->__destruct_at_end(__m);
    }
    else
    {
        deallocate();
        allocate(__recommend(static_cast<size_type>(__new_size)));
        __construct_at_end(__first, __last);
    }
}
like image 23
Matthieu M. Avatar answered Nov 07 '22 01:11

Matthieu M.