Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Vector storage in C++

I wish to store a large vector of d-dimensional points (d fixed and small: <10).

If I define a Point as vector<int>, I think a vector<Point> would store in each position a pointer to a Point.

But if define a Point as a fixed-size object like: std::tuple<int,int,...,int> or std::array<int, d>, will the program store all points in contiguous memory or will the additional level of indirection remain?

In case the answer is that arrays avoid the additional indirection, could this have a large impact on performance (cache exploit locality) while scanning the vector<Point>?

like image 488
Joseph Stack Avatar asked Oct 28 '16 10:10

Joseph Stack


People also ask

What is vector in C programming?

A vector data structure is an enhancement over the standard arrays. Unlike arrays, which have their size fixed when they are defined; vectors can be resized easily according to the requirement of the user.

Does C have a vector library?

C Vector Library A simple vector library for C. This libary's vectors work in a similar manner to C++ vectors: they can store any type, their elements can be accessed via the [] operator, and elements may be added or removed with simple library calls.

How are vectors stored?

Vectors are assigned memory in blocks of contiguous locations. When the memory allocated for the vector falls short of storing new elements, a new memory block is allocated to vector and all elements are copied from the old location to the new location. This reallocation of elements helps vectors to grow when required.

What is vector data type in C?

A vector type represents a vector of as many of the specified C data type as will fit in a 128-bit register. Hence, the vector signed int is a 128-bit operand containing four 32-bit signed ints . The vector unsigned short is a 128-bit operand containing eight unsigned values.


2 Answers

If you define your Point as having contiguous data storage (e.g. struct Point { int a; int b; int c; } or using std::array), then std::vector<Point> will store the Points in contiguous memory locations, so your memory layout will be:

p0.a, p0.b, p0.c, p1.a, p1.b, p1.c, ..., p(N-1).a, p(N-1).b, p(N-1).c 

On the other hand, if you define Point as a vector<int>, then a vector<Point> has the layout of vector<vector<int>>, which is not contiguous, as vector stores pointers to dynamically allocated memory. So you have contiguity for single Points, but not for the whole structure.

The first solution is much more efficient than the second (as modern CPUs love accessing contiguous memory locations).

like image 188
Mr.C64 Avatar answered Sep 30 '22 21:09

Mr.C64


vector will store whatever your type contains in contiguous memory. So yes, if that's an array or a tuple, or probably even better, a custom type, it will avoid indirection.

Performance-wise, as always, you have to measure it. Don't speculate. At least as far as scanning is concerned.

However, there will definitely be a huge performance gain when you create those points in the first place, because you'll avoid unnecessary memory allocations for every vector that stores a point. And memory allocations are usually very expensive in C++.

like image 44
Sergei Tachenov Avatar answered Sep 30 '22 22:09

Sergei Tachenov