I am writing a piece of code with high performance demands where I need to handle a large number of objects polymorphically. Say I have a class A and a class B derived from A. I could create a vector of Bs like this:
vector<A*> a(n);
for(int i = 0; i < n; i++)
    a[i] = new B();
but if n is large (on the order of 10^6 or more in my case) this would require very many calls to new, and moreover the n objects could end up scattered all over main memory, resulting in very poor cache performance. What would be the right way to deal with this situation? I am thinking of doing something like the following to keep all the objects in a contiguous memory region:
B* b = new B[n];
vector<A*> a(n);
for(int i = 0; i < n; i++)
    a[i] = b + i;
but one problem is how to free the memory allocated by new B[n] once b is no longer available (while we still have a). I have just learnt that trying
delete[] a[0];
is not a good idea...
Programs with good locality generally run faster because they have a lower cache miss rate than programs with poor locality. In good programming practice, cache performance is counted as one of the important factors when analysing the performance of a program.
Some more general algorithms, such as Cooley–Tukey FFT, are optimally cache-oblivious under certain choices of parameters. As these algorithms are only optimal in an asymptotic sense (ignoring constant factors), further machine-specific tuning may be required to obtain nearly optimal performance in an absolute sense.
Cache-friendly data structures fit within a cache line and are aligned in memory so that they make optimal use of cache lines. A common example is a two-dimensional matrix: choosing its row dimension so that a row fits in a cache-sized block, and traversing it in the order it is laid out, gives good performance.
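As a concrete illustration of the matrix example, row-wise traversal of a row-major matrix touches consecutive addresses, while column-wise traversal strides a full row ahead on every access. A minimal sketch (both functions compute the same sum; only the access pattern differs):

```cpp
#include <cstddef>
#include <vector>

// Sum all elements of an n x n matrix stored in row-major order.
// Row-wise traversal visits consecutive addresses (cache-friendly).
long long sum_row_major(const std::vector<int>& m, std::size_t n) {
    long long s = 0;
    for (std::size_t i = 0; i < n; i++)
        for (std::size_t j = 0; j < n; j++)
            s += m[i * n + j];   // consecutive addresses
    return s;
}

// Column-wise traversal strides by n elements per step, so each
// access may touch a different cache line (cache-hostile).
long long sum_col_major(const std::vector<int>& m, std::size_t n) {
    long long s = 0;
    for (std::size_t j = 0; j < n; j++)
        for (std::size_t i = 0; i < n; i++)
            s += m[i * n + j];   // stride of n elements
    return s;
}
```

For a matrix large enough not to fit in cache, timing the two functions shows the row-major version running substantially faster, even though they do identical arithmetic.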
If you know for sure that those will only be objects of type B, why not use a parallel vector:
vector<B> storage(n);
vector<A*> pointers(n);
for(int i = 0; i < n; i++)
    pointers[i] = &storage[i];
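Put together, a minimal self-contained sketch of this parallel-vector approach (the classes here are hypothetical stand-ins for the question's A and B; the name() member is invented purely to show a virtual call through A*):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stand-ins for the question's classes, for illustration only.
struct A {
    virtual ~A() = default;
    virtual std::string name() const { return "A"; }
};
struct B : A {
    std::string name() const override { return "B"; }
};

// storage owns n contiguous B objects; the returned vector is the
// polymorphic A* view into it. Both vectors clean up automatically,
// so no manual delete is ever needed.
std::vector<A*> make_view(std::vector<B>& storage) {
    std::vector<A*> pointers(storage.size());
    for (std::size_t i = 0; i < storage.size(); i++)
        pointers[i] = &storage[i];
    return pointers;
}
```

One caveat: any operation that reallocates the storage vector (push_back, resize, etc.) invalidates every pointer in the view, so size the storage once up front and leave it alone afterwards.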
You can use placement new to construct an object at a particular memory location:
char* storage = new char[n * sizeof(B)]; // raw buffer (mind alignment for over-aligned types)
vector<A*> a(n);
for(int i = 0; i < n; i++)
    a[i] = new(storage + i * sizeof(B)) B();
// and invoke the destructor manually to release an object (assuming A has a virtual destructor!)
a[i]->~A();
// once every object has been destroyed, release the raw buffer with delete[] storage;
But you cannot solve the 'real' problem without giving up the contiguous storage: if one object is freed, it leaves a hole in the buffer, causing increasing fragmentation over time. All you can do is keep track of the freed slots and re-use their storage.
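One way to re-use freed slots, as suggested above, is a simple free list over the contiguous buffer. A sketch (this Pool class is made up for illustration, not an established library, and it assumes T needs no stricter alignment than operator new provides):

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Fixed-capacity pool: all slots live in one contiguous buffer, and
// freed slots are recorded on a free list so they can be handed out
// again. Illustrative sketch, not a production allocator.
template <class T>
class Pool {
    std::vector<unsigned char> buf_;
    std::vector<std::size_t> free_;   // indices of available slots
public:
    explicit Pool(std::size_t n) : buf_(n * sizeof(T)) {
        for (std::size_t i = n; i > 0; i--)
            free_.push_back(i - 1);   // hand out slot 0 first
    }
    T* allocate() {
        if (free_.empty()) return nullptr;   // pool exhausted
        std::size_t i = free_.back();
        free_.pop_back();
        return new (buf_.data() + i * sizeof(T)) T();  // placement new
    }
    void release(T* p) {
        p->~T();   // destroy in place, then recycle the slot
        free_.push_back(
            (reinterpret_cast<unsigned char*>(p) - buf_.data()) / sizeof(T));
    }
};
```

The buffer never fragments beyond its own fixed capacity: a released slot goes straight back on the free list and is handed out by the next allocate().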