
C++ vector that *doesn't* initialize its members?

Tags: c++, stl, vector

I'm making a C++ wrapper for a piece of C code that returns a large array, and so I've tried to return the data in a vector<unsigned char>.

Now the problem is, the data is on the order of megabytes, and vector needlessly initializes its storage, which in practice cuts my speed in half.

How do I prevent this?

Or, if it's not possible -- is there some other STL container that would avoid such needless work? Or must I end up making my own container?

(Pre-C++11)

Note:

I'm passing the vector as my output buffer. I'm not copying the data from elsewhere.
It's something like:

    vector<unsigned char> buf(size);   // Why initialize??
    GetMyDataFromC(&buf[0], buf.size());

asked Jun 22 '12 03:06 by user541686




2 Answers

For default and value initialization of structs with user-provided default constructors which don't explicitly initialize anything, no initialization is performed on unsigned char members:

    struct uninitialized_char {
        unsigned char m;
        uninitialized_char() {}
    };

    // just to be safe
    static_assert(1 == sizeof(uninitialized_char), "");

    std::vector<uninitialized_char> v(4 * (1<<20));

    GetMyDataFromC(reinterpret_cast<unsigned char*>(&v[0]), v.size());

I think this is even legal under the strict aliasing rules.

When I compared the construction time for v vs. a vector<unsigned char> I got ~8 µs vs ~12 ms. More than 1000x faster. Compiler was clang 3.2 with libc++ and flags: -std=c++11 -Os -fcatch-undefined-behavior -ftrapv -pedantic -Weverything -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-missing-prototypes

C++11 also has a helper for uninitialized storage, std::aligned_storage, though it requires a compile-time size.
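To illustrate the std::aligned_storage remark, here is a minimal sketch of a fixed-size byte buffer built on it. The class name fixed_byte_buffer is hypothetical and not part of any library; the point is only that the size N has to be a template argument, which is why this helper does not fit a buffer whose size is known only at run time.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical helper (not from the original answer): a fixed-size byte
// buffer whose contents are left uninitialized. N must be a compile-time
// constant, which is the limitation mentioned above.
template <std::size_t N>
class fixed_byte_buffer {
public:
    unsigned char* data() { return reinterpret_cast<unsigned char*>(&storage_); }
    std::size_t size() const { return N; }

private:
    // Raw storage of at least N bytes; default-initializing the buffer
    // writes nothing to it.
    typename std::aligned_storage<N, alignof(unsigned char)>::type storage_;
};
```

Declared as `fixed_byte_buffer<4096> buf;` (without braces or parentheses) the bytes stay untouched until something writes to them through `buf.data()`. A multi-megabyte instance would normally need static or heap storage rather than the stack, and std::aligned_storage itself is deprecated as of C++23.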


Here's an added example, to compare total usage (times in nanoseconds):

VERSION=1 (vector<unsigned char>):

    clang++ -std=c++14 -stdlib=libc++ main.cpp -DVERSION=1 -ftrapv -Weverything -Wno-c++98-compat -Wno-sign-conversion -Wno-sign-compare -Os && ./a.out

    initialization+first use: 16,425,554
    array initialization: 12,228,039
    first use: 4,197,515
    second use: 4,404,043

VERSION=2 (vector<uninitialized_char>):

    clang++ -std=c++14 -stdlib=libc++ main.cpp -DVERSION=2 -ftrapv -Weverything -Wno-c++98-compat -Wno-sign-conversion -Wno-sign-compare -Os && ./a.out

    initialization+first use: 7,523,216
    array initialization: 12,782
    first use: 7,510,434
    second use: 4,155,241


    #include <iostream>
    #include <chrono>
    #include <locale>   // std::locale, used by imbue() below
    #include <vector>

    struct uninitialized_char {
      unsigned char c;
      uninitialized_char() {}
    };

    // Stand-in for the C function: writes every byte of the buffer.
    void foo(unsigned char *c, int size) {
      for (int i = 0; i < size; ++i) {
        c[i] = '\0';
      }
    }

    int main() {
      auto start = std::chrono::steady_clock::now();

    #if VERSION==1
      using element_type = unsigned char;
    #elif VERSION==2
      using element_type = uninitialized_char;
    #endif

      std::vector<element_type> v(4 * (1<<20));

      auto end = std::chrono::steady_clock::now();

      foo(reinterpret_cast<unsigned char*>(v.data()), v.size());

      auto end2 = std::chrono::steady_clock::now();

      foo(reinterpret_cast<unsigned char*>(v.data()), v.size());

      auto end3 = std::chrono::steady_clock::now();

      // Use the user's locale so the counts print with thousands separators.
      std::cout.imbue(std::locale(""));
      std::cout << "initialization+first use: " << std::chrono::nanoseconds(end2-start).count() << '\n';
      std::cout << "array initialization: " << std::chrono::nanoseconds(end-start).count() << '\n';
      std::cout << "first use: " << std::chrono::nanoseconds(end2-end).count() << '\n';
      std::cout << "second use: " << std::chrono::nanoseconds(end3-end2).count() << '\n';
    }

I'm using clang svn-3.6.0 r218006

answered Sep 27 '22 19:09 by bames53


Sorry, there's no way to avoid it.

C++11 adds a constructor that takes only a size, but even that will value-initialize the data.

Your best bet is to just allocate an array on the heap, stick it in a unique_ptr (where available), and use it from there.
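A minimal sketch of that heap-array approach, assuming a C++11 compiler for std::unique_ptr. The body of GetMyDataFromC here is a stand-in, since the real C function and its exact signature are not shown in the question, and ReadData is a hypothetical wrapper name.

```cpp
#include <cstddef>
#include <memory>

// Stand-in for the C function from the question; the real signature is not
// shown in the source, so this one is an assumption.
inline void GetMyDataFromC(unsigned char* out, std::size_t size) {
    for (std::size_t i = 0; i < size; ++i) {
        out[i] = static_cast<unsigned char>(i & 0xFF);
    }
}

// Hypothetical wrapper: allocates size bytes without zeroing them and lets
// the C function fill them in.
inline std::unique_ptr<unsigned char[]> ReadData(std::size_t size) {
    // "new unsigned char[size]" with no trailing parentheses
    // default-initializes, which for unsigned char leaves the bytes untouched.
    std::unique_ptr<unsigned char[]> buf(new unsigned char[size]);
    GetMyDataFromC(buf.get(), size);
    return buf;
}
```

The caller has to carry the size separately, because unique_ptr<T[]> does not store it. Note that std::make_unique<unsigned char[]>(size) from C++14 value-initializes the array, so it would reintroduce the zeroing; C++20 adds std::make_unique_for_overwrite for the non-zeroing case.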

If you're willing to go, as you say, "hacking into STL," you could always grab a copy of EASTL to work from. It's a variation of certain STL containers that allows for more restricted memory conditions. A proper implementation of what you're trying to do would be to give its constructor a special value that means "default-initialize the members," which for POD types means doing nothing to initialize the memory. This requires using some template metaprogramming to detect whether it is a POD type, of course.

answered Sep 27 '22 20:09 by Nicol Bolas
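The "special constructor value" idea could look roughly like the sketch below. This is not EASTL code: simple_buffer and default_init_t are hypothetical names, and the class is deliberately minimal (fixed size, no copying). It also sidesteps the metaprogramming by using new T[n], which lets the language skip initialization for trivial types; a container that manages raw storage itself would need the POD check described above.

```cpp
#include <cstddef>

// Hypothetical tag type that selects the "default-initialize" constructor.
struct default_init_t {};

// Minimal fixed-size buffer sketch; not a drop-in replacement for std::vector.
template <typename T>
class simple_buffer {
public:
    // Value-initializes every element (zeros for arithmetic types),
    // which is what vector's size constructor does.
    explicit simple_buffer(std::size_t n) : data_(new T[n]()), size_(n) {}

    // Default-initializes every element: trivial types such as unsigned char
    // are left untouched, class types still run their default constructor.
    simple_buffer(std::size_t n, default_init_t) : data_(new T[n]), size_(n) {}

    ~simple_buffer() { delete[] data_; }

    T* data() { return data_; }
    std::size_t size() const { return size_; }
    T& operator[](std::size_t i) { return data_[i]; }

private:
    // Copying is disabled (declared but not defined, C++98 style).
    simple_buffer(const simple_buffer&);
    simple_buffer& operator=(const simple_buffer&);

    T* data_;
    std::size_t size_;
};
```

With this, `simple_buffer<unsigned char> buf(size, default_init_t());` allocates without touching the bytes, and `buf.data()` can be handed straight to the C function as the output buffer.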