Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What's the fastest way to reinitialize a vector?

What is the fastest way to reset all values for a large vector to its default values?

struct foo
{
  int id;
  float score;
};

std::vector<foo> large_vector(10000000);

The simplest way would be to create a new vector, but I guess it takes more time to reallocate memory than to reinitialize an existing one?

I have to iterate over the vector to collect non-zero scores (could be thousands or millions) before resetting it. Should I reset the structs one by one in this loop?

Edit:

The vector size is fixed and 'default value' means 0 for every struct member (all floats and ints).

like image 221
Ben Avatar asked Mar 20 '19 17:03

Ben


People also ask

Can you pop a vector?

In C++, you can simply remove or pop the first member in a vector with this method. The std::vector does not include a function for removing or popping an element from the top of the list, as you may already know. Push back and pop back are the only two functions available for adding and removing components.

What does vector Clear () do?

vector::clear() clear() function is used to remove all the elements of the vector container, thus making it size 0.

How do I reset a vector to zero?

If it's just a vector of integers, I'd first try: memset(&my_vector[0], 0, my_vector. size() * sizeof my_vector[0]);


3 Answers

What's the fastest way to reinitialize a vector?

Don't.

Just record the fact that the vector has no valid entries by calling clear(). This has the advantage of being both (probably) optimal, and guaranteed correct, and also being perfectly expressive. IMO none of the suggested alternatives should be considered unless profiling shows an actual need.

Your element type is trivial, so the linear upper-bound on complexity should in reality be constant for a decent quality implementation - there's no need to destroy each element in turn.

No memory is deallocated, or needs to be re-allocated later.

You'll just need to push_back or emplace_back when you're writing into the vector after clear()ing, instead of using operator[].

To make this consistent with the first use, don't initialize your vector with 10000000 value-constructed elements, but use reserve(10000000) to pre-allocate without initialization.

eg.

int main() {
  vector<foo> v;
  v.reserve(10000000);

  while(keep_running) {
    use(v);
    v.clear();
  }
}

// precondition: v is empty, so
// don't access v[i] until you've done
//   v.push_back({id,score})
// at least i+1 times
void use(vector<foo> &v) {
}

Since you need to zero your elements in-place, the second fastest general-purpose solution is probably to alter the loop above to

  while(keep_running) {
    v.resize(10000000);
    use(v);
    v.clear();
  }

or alternatively to remove the clear() and use fill() to overwrite all elements in-place.

If non-zero elements are sparse, as may be the case if you're updating them based on some meaningful index, it might be faster to zero them on the fly as your main loop iterates over the vector.

Again, you really need to profile to find out which is better for your use case.

like image 176
Useless Avatar answered Sep 22 '22 18:09

Useless


In order to determine the fastest way you will need to run some benchmarks.

There are a number of different ways to "reinitialise" a vector:

  1. Call clear(), for trivial types this should be roughly equivalent to just doing vector.size = 0. The capacity of the vector doesn't change and no elements are deallocated. Destructors will be called on elements if they exist. As you push_back, emplace_back or resize the vector the old values will be overwritten.
  2. Call assign(), e.g. large_vector.assign( large_vector.size(), Foo() );. This will iterate through the whole vector resetting every element to its default value. Hopefully the compiler will manage to optimise this to a memset or similar.
  3. As your type is trivial, if you want to just reset every element to 0 you should be able to do a memset, e.g.: memset( large_vector.data(), 0, sizeof(Foo)*large_vector.size() );.
  4. Call std::fill e.g. std::fill( large_vector.begin(), large_vector.end(), Foo() );, this should be similar to assign or memset.
like image 34
Alan Birtles Avatar answered Sep 21 '22 18:09

Alan Birtles


What is the fastest way to reset all values for a large vector to its default values?

Depends on what vector in its "default values" means.

If you want to remove all elements, most efficient is std::vector::clear.

If you want to keep all elements in the vector but set their state, then you can use std::fill:

std::fill(large_vector.begin(), large_vector.end(), default_value);

If the element type is trivial, and the "default value" is zero, then std::memset may be optimal:

static_assert(std::is_trivially_copyable_v<decltype(large_vector[0])>);
std::memset(large_vector.data(), 0, large_vector.size() * sizeof(large_vector[0]));

To verify that std::memset is worth the trouble, you should measure (or inspect assembly). The optimiser may do the work for you.

Zero in the sense that all bits are unset. C++ does not guarantee that this is a representation for a zero float. It also doesn't guarantee it to be a null pointer, in case your non-minimal use case uses pointers.

like image 27
eerorika Avatar answered Sep 24 '22 18:09

eerorika