Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is better: reserve vector capacity, preallocate to size or push back in loop?

Tags:

c++

stdvector

I have a function that takes a pointer to char array and segment size as input arguments and calls another function that requires a std::array<std::string>. The idea is that the input char array is "sectioned" into equal parts, and string array formed.

The input char array format is several smaller arrays (or strings) of determined size, concatenated togeather. These are not assumed zero-terminated, although they might be. Examples for segment size 5 and number of elements 10:

char k[] = "1234\0001234\0001234\0001234\0001234\0001234\0001234\0001234\0001234\0001234\000";
char m[] = "1234\00067890987654321\000234567809876\0005432\000\000\0003456789098";
char n[] = "12345678909876543211234567890987654321123456789098";

Length of all char arrays is 51 (segment * elements + 1). My goal is to make the function use resources efficiently, most importantly execution time.

Since there are many ways to skin a cat, I have two (or three) ways to tackle this, and the question is, which is "better"? By that I mean faster and less resource-wasteful. I am not a professional, so be patient with me.

Here, values is preallocated and then each string assigned a value.

void myfnc_1(void *a_src, uint32_t a_segment) {
    // a_segment = 5 for example
    size_t nSize = GetSize(); // another method, gets 10
    std::vector<std::string> values(nSize);
    char* v = a_src; // take k, n or m for example

    for (size_t i = 0; i < nSize; ++i) {
        values.at(i).assign(v, a_segment);
        v += a_segment;
    }
}

Here, the vector is not allocated, but each iteration a new string is added.

void myfnc_1(void *a_src, uint32_t a_segment) {
    size_t nSize = GetSize();
    std::vector<std::string> values();
    char* v = a_src;

    for (size_t i = 0; i < nSize; ++i) {
        values.push_back("");
        values.back().assign(v, a_segment);
        v += a_segment;
    }
}

There might be a third way, that's better. I'm not so experienced with vectors so I don't know exactly. Do the segment length and number of elements make a difference if they are usually large (5, 10) or small (100, 10000)?

First post, big fan :)

like image 685
daljaz Avatar asked Aug 25 '15 08:08

daljaz


People also ask

What is reserve in vector?

C++ Vector Library - reserve() Function The C++ function std::vector::reserve() requests to reserve vector capacity be at least enough to contain n elements. Reallocation happens if there is need of more space.

How vector memory is allocated?

Vectors are assigned memory in blocks of contiguous locations. When the memory allocated for the vector falls short of storing new elements, a new memory block is allocated to vector and all elements are copied from the old location to the new location. This reallocation of elements helps vectors to grow when required.

What is reserve function in c++?

The unordered_set::reserve() method is a builtin function in C++ STL which is used to request capacity change of unordered_set. It sets the number of buckets in the container to contain at least n elements.


2 Answers

Adding elements to a vector

There are several ways to add data to a vector:

  • create an empty vector, push_back() elements into it.
  • create an empty vector, allocate some capacity with reserve(), then push_back() elements into it.
  • create a vector of n elements, use indexing and copy-assignment.
  • create an empty vector, emplace_back() elements into it.
  • create an empty vector, allocate some capacity with reserve(), then emplace_back() elements into it.

There are other ways, e.g. creating the container with a pair of iterators, or filling it via standard library algorithms. I won't consider these here.

General considerations

The following two considerations are important for the following analysis:

  • Avoid (re)allocations: Memory allocation is slow. reallocation often involves copying everything that's already in the container to the new location.
  • Avoid unnecessary work: It's better to construct the element you want than to default-construct, then assign. It's better to construct an element where you want it than to construct it elsewhere, then copy.

Other factors will also affect the efficiency of the chosen solution, but these are significant factors that we have direct control over. Other factors may become obvious through profiling your code.

push_back()

Each push_back() copy-constructs an element in the vector from the argument to the push_back() call. If the vector size() == capacity(), a reallocation will be performed. This will usually (but may not always) double the capacity, and may result in copying all of the existing elements into new storage.

push_back() with pre-allocation

Using reserve() allocates enough memory for the elements before we start. It's always worth doing this if you know (or have a reasonable guess for) the number of elements. If you're guessing, over-estimates are better than under-estimates.

The push_back() call will still use the copy constructor of the element type, but there shouldn't be any allocations, because the space is already provided. You just pay the cost of a single allocation during the reserve() call. If you do run over the existing capacity, push_back() will reallocate, often doubling the capacity. This is why an over-estimate for the size is better; you're less likely to get a reallocation, whereas with an under-estimate, not only will you likely reallocate, but you'll waste memory allocating almost twice as much as you needed!

Note that the "doubling capacity" behaviour is not specified by the standard, but it is a common implementation, designed to reduce the reallocation frequency when using push_back() for data sets of unknown size.

indexing and element assignment

Here, we create a vector of the correct number of default-constructed elements, and then use the copy-assignment operator to replace them with the elements we want. This has only one allocation, but can be slow if copy-assignment does anything complicated. This doesn't really work for data sets of unknown (or only guessed) size; the element indexing is safe only if you know the index will never exceed size(), and you have to resort to push_back() or resizing if you need more. This isn't a good general solution, but it can work sometimes.

emplace_back()

emplace_back() constructs the element in-place with the arguments to the emplace_back() call. This can often be faster than the equivalent push_back() (but not always). It still allocates in the same pattern as push_back(), reserving some capacity, filling it, then reallocating when more is needed. Much of the same argument applies, but you can make some gains from the construction method.

emplace_back() with pre-allocation

This should be your go-to strategy for C++11 or later codebases. You gain the emplace_back() efficiency (where possible) and avoid repeated allocations. Of the mechanisms listed, this would be expected to be the fastest in most cases.

A note on efficiency

Efficiency can be measured in several ways. Execution time is a common measure, but not always the most relevant. General advice about which strategy to use is based on experience and essentially "averages" the various effects to provide some reasonable statements about what to do first. As always, if any kind of efficiency is critical for your application, the only way to be sure you're optimising the right place is to profile your code, make changes, and then profile it again to demonstrate that the changes had the desired effect. Different data types, hardware, I/O requirements, etc. can all affect this kind of timing, and you will never know how those effects combine in your particular application until you profile it.

Example

Live example: http://coliru.stacked-crooked.com/a/83d23c2d0dcee2ff

In this example, I fill a vector with 10,000 strings using each of the approaches listed above. I time each one and print the results.

This is similar to your question, but not identical; your results will differ!

Note that the emplace_back() with reserve() is the fastest, but the indexing & assignment is also quick here. This is likely because std::string has an efficient swap(), and its default constructor doesn't do much. The other approaches are an order of magnitude slower.

#include <chrono>
#include <iostream>
#include <vector>

using Clock = std::chrono::high_resolution_clock;
using time_point = std::chrono::time_point<Clock>;

std::vector<std::string> strings = {"one", "two", "three", "four", "five"};

std::chrono::duration<double> vector_push_back(const size_t n) {
    time_point start, end;
    start = Clock::now();

    std::vector<std::string> v;
    for (size_t i = 0; i < n; ++i) {
        v.push_back(strings[i % strings.size()]);
    }

    end = Clock::now();
    return end - start;
}

std::chrono::duration<double> vector_push_back_with_reserve(const size_t n) {
    time_point start, end;
    start = Clock::now();

    std::vector<std::string> v;
    v.reserve(n);
    for (size_t i = 0; i < n; ++i) {
        v.push_back(strings[i % strings.size()]);
    }

    end = Clock::now();
    return end - start;
}

std::chrono::duration<double> vector_element_assignment(const size_t n) {
    time_point start, end;
    start = Clock::now();

    std::vector<std::string> v(n);
    for (size_t i = 0; i < n; ++i) {
        v[i] = strings[i % strings.size()];
    }

    end = Clock::now();
    return end - start;
}

std::chrono::duration<double> vector_emplace_back(const size_t n) {
    time_point start, end;
    start = Clock::now();

    std::vector<std::string> v;
    for (size_t i = 0; i < n; ++i) {
        v.emplace_back(strings[i % strings.size()]);
    }

    end = Clock::now();
    return end - start;
}

std::chrono::duration<double> vector_emplace_back_with_reserve(const size_t n) {
    time_point start, end;
    start = Clock::now();

    std::vector<std::string> v;
    v.reserve(n);
    for (size_t i = 0; i < n; ++i) {
        v.emplace_back(strings[i % strings.size()]);
    }

    end = Clock::now();
    return end - start;
}

int main() {
    const size_t n = 10000;
    std::cout << "vector push_back: " << vector_push_back(n).count() << "\n";
    std::cout << "vector push_back with reserve: " << vector_push_back(n).count() << "\n";
    std::cout << "vector element assignment: " << vector_element_assignment(n).count() << "\n";
    std::cout << "vector emplace_back: " << vector_emplace_back(n).count() << "\n";
    std::cout << "vector emplace_back with reserve: " << vector_emplace_back_with_reserve(n).count() << "\n";
}

Results:

vector push_back: 0.00205563
vector push_back with reserve: 0.00152464
vector element assignment: 0.000610934
vector emplace_back: 0.00125141
vector emplace_back with reserve: 0.000545451

Conclusions

For most new code, using reserve() and emplace_back() (or push_back() for older code) should give you a good first-approximation for efficiency. If it isn't enough, profile it and find out where the bottleneck is. It probably won't be where you think it is.

like image 175
Andrew Avatar answered Sep 22 '22 20:09

Andrew


Better performance will be reached when avoiding dynamic reallocation, so try to have the vector memory be big enough to receive all elements.

Your first solution will be more efficient because if nSize is bigger than default vector capacity, the second one will need a reallocation to be able to store all elements.

As commented by Melkon, reserve is even better:

void myfnc_1(void *a_src, uint32_t a_segment) {
    size_t nSize = GetSize();
    std::vector<std::string> values;
    values.reserve( nSize );
    char* v = a_src;

    for (size_t i = 0; i < nSize; ++i) {
        values.push_back( std::string( v, a_segment ) );
        v += a_segment;
    }
}
like image 29
jpo38 Avatar answered Sep 20 '22 20:09

jpo38