I try to figure out the best way to write a general container (vectors, matrices, higher-dimensional objects) for an arbitrary number of dimensions. The number of dimensions as well as the number of elements per dimension shall be specified at compile time like this:
3
gives a vector with 3 elements10, 10
gives a matrix with 100 elements7, 5, 3
gives a second-order tensor with 105 elementsOne should be able to loop over all elements in an easy way for simple manipulations: multiply all (double
) elements by a scalar double
, add two compatible containers element-wise etc. Additionally, one should be able to loop over all elements knowing their respective index per dimension, e.g. from (0, 0, 0)
to (6, 4, 2)
for a tensor.
I thought that variadic template parameters are a nice tool for recursively connecting iterators for each dimension.
template<
int N, // the number of elements to iterate for this dimension
int... otherN // the other dimension's lengths
> class dimIterator;
For storing the pointer to the first element, I had a boxIterator
in mind which is just a wrapper for the dimIterator
s
template<
typename dataType,
int... allN
> class boxIterator : public dimIterator<allN...> { // tried as member or inheritance
protected:
dataType* data;
public:
boxIterator(dataType* data) :
dimIterator<allN...>(),
data(data)
{
//cout << "constructing box iterator." << endl;
}
int getIndex() const {
int result = dimIterator<allN...>::getIndex();
return result;
}
dataType* getPtr() {
dataType* result = data + this->getIndex();
return result;
}
const dataType* getPtr() const {
dataType* result = data + this->getIndex();
return result;
}
bool isDone() const {return dimIterator<allN...>::isDone();}
boxIterator<dataType, allN...>& operator ++() {
dimIterator<allN...>::operator++();
return *this;
}
dataType& operator *() {return *(this->getPtr());}
const dataType& operator *() const {return *(this->getPtr());}
};
template<
int N, // the number of elements to iterate for this dimension
int... otherN // the other dimension's lengths
> class dimIterator : public dimIterator<otherN...> { // tried as member or inheritance
protected:
int i;
public:
dimIterator() :
dimIterator<otherN...>(),
i(0)
{
//cout << "constructing level with " << N << " gridpoints." << endl;
}
int getIndex() const {
int result = i + N*dimIterator<otherN...>::getIndex();
return result;
}
bool isDone() const {return dimIterator<otherN...>::isDone();}
dimIterator<N, otherN...>& operator ++() {
if(i<N-1) {
++i;
}
else {
i = 0;
dimIterator<otherN...>::operator++();
}
return *this;
}
};
template<int N> // stop recursion if only one dimension left
class dimIterator<N> {
protected:
int i;
public:
dimIterator() :
i(0)
{
//cout << "constructing last level with " << N << " gridpoints." << endl;
}
int getIndex() const {
int result = i;
return result;
}
bool isDone() const {return ( i>= N );}
dimIterator<N>& operator ++() {
++i;
return *this;
}
};
Initially, I was quite happy with that approach because it allowed to write the same high-level iterator loop for an arbitrary number of dimensions. The indices for each dimension and similar things like the pitch for a given box in a certain dimension could be obtained easily.
However, eventhough I tried to use templates and inline functions for the iterator logic, the compiler does not optimize the stuff to something which is as fast as executing nested for
loops. I had a test with an empty multiplication of uninitialized doubles and repeatable get
for
loopsfor
loops?Compiled with g++ -O3 -std=c++11
. Version is g++ (GCC) 4.8.1
.
Full code (partially repeated from above):
#include <iostream>
using namespace std;
template<int first, int... other>
class integerPack {
public:
constexpr static int length = 1 + integerPack<other...>::length;
constexpr static int product = first*integerPack<other...>::product;
};
template<int only>
class integerPack<only> {
public:
constexpr static int length = 1;
constexpr static int product = only;
};
template<
int N, // the number of elements to iterate for this dimension
int... otherN // the other dimension's lengths
> class dimIterator;
template<
typename dataType,
int... allN
> class boxIterator : public dimIterator<allN...> { // tried as member or inheritance
protected:
dataType* data;
public:
boxIterator(dataType* data) :
dimIterator<allN...>(),
data(data)
{
//cout << "constructing box iterator." << endl;
}
int getIndex() const {
int result = dimIterator<allN...>::getIndex();
return result;
}
dataType* getPtr() {
dataType* result = data + this->getIndex();
return result;
}
const dataType* getPtr() const {
dataType* result = data + this->getIndex();
return result;
}
bool isDone() const {return dimIterator<allN...>::isDone();}
boxIterator<dataType, allN...>& operator ++() {
dimIterator<allN...>::operator++();
return *this;
}
dataType& operator *() {return *(this->getPtr());}
const dataType& operator *() const {return *(this->getPtr());}
};
template<
int N, // the number of elements to iterate for this dimension
int... otherN // the other dimension's lengths
> class dimIterator : public dimIterator<otherN...> { // tried as member or inheritance
protected:
int i;
public:
dimIterator() :
dimIterator<otherN...>(),
i(0)
{
//cout << "constructing level with " << N << " gridpoints." << endl;
}
int getIndex() const {
int result = i + N*dimIterator<otherN...>::getIndex();
return result;
}
bool isDone() const {return dimIterator<otherN...>::isDone();}
dimIterator<N, otherN...>& operator ++() {
if(i<N-1) {
++i;
}
else {
i = 0;
dimIterator<otherN...>::operator++();
}
return *this;
}
};
template<int N> // stop recursion if only one dimension left
class dimIterator<N> {
protected:
int i;
public:
dimIterator() :
i(0)
{
//cout << "constructing last level with " << N << " gridpoints." << endl;
}
int getIndex() const {
int result = i;
return result;
}
bool isDone() const {return ( i>= N );}
dimIterator<N>& operator ++() {
++i;
return *this;
}
};
template<
int... allN
> class box {
public:
constexpr static int dimension = integerPack<allN...>::length;
constexpr static int NN = integerPack<allN...>::product;
template<typename dataType>
using iterator = boxIterator<dataType, allN...>;
};
template<typename dataType, typename boxType>
class boxQuantity {
public:
typedef typename boxType::template iterator<dataType> iterator;
constexpr static int dimension = boxType::dimension;
constexpr static int NN = boxType::NN;
typedef boxQuantity<dataType, boxType> thisClass;
protected:
boxType mybox;
dataType* data;
iterator myit;
public:
boxQuantity(
const boxType& mybox
) :
mybox(mybox),
data(new dataType[NN]),
myit(data)
{
cout << "I am a quantity of dimension " << dimension
<< " with " << NN << " gridpoints." << endl;
}
virtual ~boxQuantity() {
delete[] data;
}
iterator begin() const {
iterator it(data);
return it;
}
dataType& operator [] (int i) {return data[i];}
const dataType& operator [] (int i) const {return data[i];}
// iterator syntax with recursive for-like logic: 5 s
virtual thisClass& operator *= (const thisClass& rhs) {
thisClass& lhs = *this;
for(iterator it=lhs.begin(); !it.isDone(); ++it) {
lhs[myit.getIndex()] *= rhs[myit.getIndex()];
}
return *this;
}
// plain nested native for loops: 2.5 s
/*virtual thisClass& operator *= (const thisClass& rhs) {
thisClass& lhs = *this;
for(int yindex=0; yindex<1000; ++yindex) {
for(int xindex=0; xindex<1000; ++xindex) {
lhs[yindex*1000 + xindex] *= rhs[yindex*1000 + xindex];
}
}
return *this;
}*/
};
typedef box<1000, 1000> boxType;
typedef boxQuantity<double, boxType> qType;
int main() {
boxType qBox;
qType q1(qBox);
qType q2(qBox);
for(int i=0; i<2000; ++i) {
q1 *= q2;
}
return 0;
}
virtual
keywordThanks for your answer. The code was assembled by taking pieces of a bigger program where the virtual
is not that obviously useless as in my example. I expected that the virtual
mainly influences the time for calling the function/operator, but not for executing it. As I know that the main part of the execution time is spent inside the loops, I forgot to remove the virtual
. However, removing the virtual
does not cause a significant performance benefit in my case? (I tested it after your advice.)
What I probably missed to point out is that I would like to iterate over all elements in a contiguous block of memory and to be able to get the n-dimensional indices for each element. Furthermore, I would like to be able to access the neighbour elements in the direction of a given dimension. Maybe it was a bad idea to remove those functions like e.g. (inside dimIterator
)
template<int dindex>
int getCoord() const {
if(1 == dindex) {
return i;
}
return otherDims.template getCoord<dindex-1>();
}
My previous idea was to
However, I would prefer a general solution like above. I can not imagine that it has to be that much slower because I do not see the effective difference compared to nested loops.
The boxQuantity
's virtual operator*=()
will not be inlined, which prevents a lot of optimizations. In your example you don't derive from this class. Removing this virtual
operator enables the code to be inlined and the SIMD instructions to be used:
00861330 movsd xmm0,mmword ptr [edx+eax*8]
00861335 mulsd xmm0,mmword ptr [edi+eax*8]
0086133A movsd mmword ptr [edi+eax*8],xmm0
0086133F dec ecx
00861340 jne main+90h (0861330h)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With