I'm trying to figure out how I can iterate through a container (like std::vector) of objects that share a common base class while keeping them contiguous in memory.
To demonstrate the problem, let's use the following example classes.
class Base
{
public:
    Base();
    virtual ~Base() = default; // virtual destructor so deleting through Base* is well-defined
    virtual void doStuff() = 0;
};
class DerivedA : public Base
{
private:
    // specific A member variables
public:
    DerivedA();
    virtual void doStuff();
};
class DerivedB : public Base
{
private:
    // specific B member variables
public:
    DerivedB();
    virtual void doStuff();
};
Now, a std::vector<Base> would keep the objects in contiguous memory, but it can't even be used here since Base is abstract, and with a non-abstract base we would experience slicing: there isn't room for the derived members. So we have to use the polymorphic technique with pointers, like so:
#include <vector>

int main ()
{
    std::vector<Base*> container;
    container.push_back(new DerivedA());
    container.push_back(new DerivedB());

    for (std::vector<Base*>::iterator i = container.begin(); i != container.end(); ++i)
    {
        (*i)->doStuff(); // equivalent to (*(*i)).doStuff()
    }

    for (std::vector<Base*>::iterator i = container.begin(); i != container.end(); ++i)
    {
        delete *i; // avoid the leak; requires the virtual destructor in Base
    }
}
As far as I know, that should work fine, given that the classes are implemented.
Problem: Now the vector stores its pointers in contiguous memory, but that does not mean the objects they point to are contiguous.
So if I want to be able to delete and insert objects into the vector on the fly at any time, the objects will end up scattered all over memory.
Question: It seems like everyone suggests the std::vector-of-pointers approach, but why isn't it considered problematic that it can't be iterated contiguously in memory (given that we actually follow the pointers)?
Am I forced to do it the copy-pasta way?
int main ()
{
    std::vector<DerivedA> containerA;
    DerivedA a;
    containerA.push_back(a);

    std::vector<DerivedB> containerB;
    DerivedB b;
    containerB.push_back(b);

    for (std::vector<DerivedA>::iterator i = containerA.begin(); i != containerA.end(); ++i)
    {
        i->doStuff();
    }

    for (std::vector<DerivedB>::iterator i = containerB.begin(); i != containerB.end(); ++i)
    {
        i->doStuff();
    }
}
I'm guessing there might not be a real solution to this since keeping objects of various sizes linearly in memory doesn't really make sense but if anyone can give me some advice I'd appreciate it.
Let's take the questions in order.
A: You can't.
Suppose you used some placement new shenanigans to arrange your objects in memory like this:
[B ][DA ][DB ][B ][B ][DB ][DA ]
How would the iteration mechanism know how far to advance the iteration pointer from one object to the next? The number of bytes from the first element to the second is different from the second to the third.
The reason contiguous arrays have to be homogeneous is so the distance from one object to the next is the same for all elements. You might try to embed a size in each object, but then you basically have a linked list rather than an array (albeit one with good locality), as the sketch below illustrates.
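To make that concrete, here is a rough sketch (my own construction, not from the answer) of the embedded-bookkeeping idea: objects are placement-new'ed back to back into one buffer for locality, but you still need a per-element record of where each object starts, because the stride varies. It relies on the virtual destructor added to Base above and omits capacity checks.

#include <cstddef>
#include <new>
#include <vector>

class Pool
{
    alignas(std::max_align_t) unsigned char bytes_[4096]; // fixed capacity; a real version would check overflow
    std::size_t used_ = 0;
    std::vector<Base*> index_; // where each element starts -- the "linked list" in disguise
public:
    template <class T>
    T* emplace()
    {
        used_ = (used_ + alignof(T) - 1) / alignof(T) * alignof(T); // round up to T's alignment
        T* p = ::new (bytes_ + used_) T(); // construct in place
        used_ += sizeof(T);
        index_.push_back(p); // without this record we could not find the next object
        return p;
    }
    void doStuffToAll()
    {
        for (Base* b : index_) b->doStuff(); // still pointer-chasing, just with better locality
    }
    ~Pool()
    {
        for (Base* b : index_) b->~Base(); // relies on virtual ~Base()
    }
};

Note that iteration still goes through the index_ pointers; packing the objects buys locality, not pointer-free iteration.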
This reasoning leads to the idea of using an array of pointers, about which you have posed the next question:
A: It is not as slow as you think.
Your concern seems to be the performance of following pointers to scattered memory locations. But the cost of following these pointers is unlikely to be dominant. Don't get hung up on micro-optimizations like memory layout until you have solid evidence they are needed.
A: No!
Here, the concern seems to be maintainability rather than performance. Maintainability is much more important in my opinion, and a good thing to think about early.
For maintainability, you already have a fine solution: maintain a vector of Base*.
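If ownership bothers you, here is a sketch of the modern spelling of that same solution: std::unique_ptr<Base> (via C++14's std::make_unique) keeps the pointer-based design but frees the elements automatically, which the raw-pointer version in the question never does.

#include <memory>
#include <vector>

int main()
{
    std::vector<std::unique_ptr<Base>> container;
    container.push_back(std::make_unique<DerivedA>());
    container.push_back(std::make_unique<DerivedB>());

    for (auto& p : container)
        p->doStuff(); // virtual dispatch, exactly as before

    // no manual delete: each unique_ptr destroys its object via virtual ~Base()
}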
If you really want to use multiple vectors, there is still a better way than copy and paste: use a template function, like this (untested):
template <class T>
void doStuffToVector(std::vector<T>& vec)
{
    // note: typename is required here because the iterator type depends on T
    for (typename std::vector<T>::iterator i = vec.begin(); i != vec.end(); ++i) {
        i->doStuff();
    }
}
Then call it on each container:
doStuffToVector(containerA);
doStuffToVector(containerB);
If your only concern is maintainability, either the vector of pointers or the template function should suffice.
A: For starters, ignore performance, at least as far as constant factors are concerned. Concentrate on correctness and maintainability.
Then, measure performance. Observe that this question did not begin with a statement of current and desired speed. You don't yet have an actual problem to solve!
After measurement, if you conclude it's too slow, use a profiler to find out where the slow spots are. They are almost never where you think they will be.
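Even before reaching for a profiler, you can at least get a number. Here is a minimal timing sketch (secondsTaken is my own helper, not a standard facility); a profiler then tells you where the time goes, not just how much of it there is:

#include <chrono>

template <class F>
double secondsTaken(F f)
{
    auto t0 = std::chrono::steady_clock::now();
    f(); // the code under test
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}

// e.g. double s = secondsTaken([&] { doStuffToVector(containerA); });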
Case in point: this entire question and its answers have focused on the iteration, but no one has raised the point that the virtual function calls to doStuff are much more likely to be the bottleneck! Virtual function calls are expensive because they are indirect control flow, which causes major problems for the pipeline; indirect data access is much less expensive because the data cache is usually very effective at satisfying data access requests quickly.
A: After careful measurement, you might possibly find that this code (the iteration itself, including virtual function dispatch; not what's inside doStuff) is a bottleneck. That must mean it's being executed for billions of iterations, minimum.
First, look into algorithmic improvements that will reduce the number of iterations required.
Next, eliminate the virtual function call, for example by embedding an explicit indicator of object type and testing it with if or switch, as in the sketch below. That will allow the processor's branch predictor to be much more effective.
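A minimal sketch of that transformation (the Kind tag and the Tagged* names are mine, not from the question):

enum class Kind { A, B };

struct TaggedBase
{
    Kind kind;
    explicit TaggedBase(Kind k) : kind(k) {}
};

struct TaggedA : TaggedBase
{
    TaggedA() : TaggedBase(Kind::A) {}
    void doStuffA(); // ordinary, non-virtual
};

struct TaggedB : TaggedBase
{
    TaggedB() : TaggedBase(Kind::B) {}
    void doStuffB(); // ordinary, non-virtual
};

void doStuff(TaggedBase* p)
{
    switch (p->kind) // a direct, predictable branch instead of an indirect call
    {
    case Kind::A: static_cast<TaggedA*>(p)->doStuffA(); break;
    case Kind::B: static_cast<TaggedB*>(p)->doStuffB(); break;
    }
}

The static_casts are safe because the tag guarantees the dynamic type.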
Finally, yes, you'd probably want to put all of the elements into one contiguous array to improve locality and eliminate the indirect data access. That will mean eliminating the class hierarchy too, so all objects are the same type, either by combining all of the fields into a single class and/or by using a union (sketched below). This will harm your program's maintainability! That is sometimes one of the costs of writing high-performance code, but it is actually necessary only very rarely.
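A sketch of that final layout (all names and fields here are placeholders; the question never shows the derived members):

#include <vector>

struct AFields { int x, y; };     // stand-ins for DerivedA's members
struct BFields { float weight; }; // stand-ins for DerivedB's members

struct Thing
{
    enum class Kind { A, B } kind;
    union { AFields a; BFields b; }; // one size for every element
};

void doStuffToAll(std::vector<Thing>& things)
{
    for (Thing& t : things) // contiguous, uniform stride, no pointers at all
    {
        switch (t.kind)
        {
        case Thing::Kind::A: /* work with t.a */ break;
        case Thing::Kind::B: /* work with t.b */ break;
        }
    }
}

In newer code, std::variant packages the same tagged-union idea with type safety; either way, every element has the same size and iterates contiguously.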