I wish to store a large vector of d-dimensional points (d fixed and small: <10).
If I define a Point
as vector<int>
, I think a vector<Point>
would store in each position a pointer to a Point.
But if define a Point
as a fixed-size object like: std::tuple<int,int,...,int>
or std::array<int, d>
, will the program store all points in contiguous memory or will the additional level of indirection remain?
In case the answer is that arrays avoid the additional indirection, could this have a large impact on performance (cache exploit locality) while scanning the vector<Point>
?
A vector data structure is an enhancement over the standard arrays. Unlike arrays, which have their size fixed when they are defined; vectors can be resized easily according to the requirement of the user.
C Vector Library A simple vector library for C. This libary's vectors work in a similar manner to C++ vectors: they can store any type, their elements can be accessed via the [] operator, and elements may be added or removed with simple library calls.
Vectors are assigned memory in blocks of contiguous locations. When the memory allocated for the vector falls short of storing new elements, a new memory block is allocated to vector and all elements are copied from the old location to the new location. This reallocation of elements helps vectors to grow when required.
A vector type represents a vector of as many of the specified C data type as will fit in a 128-bit register. Hence, the vector signed int is a 128-bit operand containing four 32-bit signed ints . The vector unsigned short is a 128-bit operand containing eight unsigned values.
If you define your Point
as having contiguous data storage (e.g. struct Point { int a; int b; int c; }
or using std::array
), then std::vector<Point>
will store the Point
s in contiguous memory locations, so your memory layout will be:
p0.a, p0.b, p0.c, p1.a, p1.b, p1.c, ..., p(N-1).a, p(N-1).b, p(N-1).c
On the other hand, if you define Point
as a vector<int>
, then a vector<Point>
has the layout of vector<vector<int>>
, which is not contiguous, as vector
stores pointers to dynamically allocated memory. So you have contiguity for single Point
s, but not for the whole structure.
The first solution is much more efficient than the second (as modern CPUs love accessing contiguous memory locations).
vector
will store whatever your type contains in contiguous memory. So yes, if that's an array
or a tuple
, or probably even better, a custom type, it will avoid indirection.
Performance-wise, as always, you have to measure it. Don't speculate. At least as far as scanning is concerned.
However, there will definitely be a huge performance gain when you create those points in the first place, because you'll avoid unnecessary memory allocations for every vector
that stores a point. And memory allocations are usually very expensive in C++.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With