Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Initializing An Array of Unknown Dimensionality

I was surprised that I couldn't find this question existing. I've tried to generalize it (with some nice untested code) to something everyone can benefit from.

Suppose I have a multidimensional Point:

template <int dims> class Point { public: double data[dims]; };

Now I create a multidimensional array of them:

 template <int dims> void foobar(int count0, ...) {
    //Using variadic function.  Could also use variadic templates in C++ (arguably better)
    int counts[dims], total_count=count0; counts[0]=count0;
    va_list args; va_start(args,count0);
    for (int i=1;i<dims;++i) {
        int count = va_arg(args,int);
        counts[i] = count;
        total_count *= count;
    }
    va_end(args);

    Point<dims>* array = new Point<dims>[total_count];

    //...
}

As you can see, array is a multidimensional array of unknown dimensionality, represented in a 1D array.


My question: how can I cleanly initialize this array to its multidimensional grid points?

Here's the example behavior I want in 1, 2, and 3 dimensions. Obviously, I don't want to write this for every possible dimensionality I might want to use! The goal is to generalize this.

//Example: dim==1
for (int x=0; x<counts[0]; ++x) {
    Point<1>& point = array[x];
    point.data[0] = (x+0.5) / (double)counts[0];
}

//Example: dim==2
for (int y=0; y<counts[1]; ++y) {
    for (int x=0; x<counts[0]; ++x) {
        Point<2>& point = array[y*counts[0]+x];
        point.data[0] = (x+0.5) / (double)counts[0];
        point.data[1] = (y+0.5) / (double)counts[1];
    }
}

//Example: dim==3
for (int z=0; z<counts[2]; ++z) {
    for (int y=0; y<counts[1]; ++y) {
        for (int x=0; x<counts[0]; ++x) {
            Point<3>& point = array[(z*counts[1]+y)*counts[0]+x];
            point.data[0] = (x+0.5) / (double)counts[0];
            point.data[1] = (y+0.5) / (double)counts[1];
            point.data[2] = (z+0.5) / (double)counts[2];
        }
    }
}

Again, my question: generalize the above for any number of nested loops/dimensions in a clean way.

Note: I've come up with a few nasty ways, and they're inelegant and slow. Especially, I want to avoid recursion, if possible, since this will be called on high-dimensional smallish datasets quite frequently. Note: There are obvious parallels in C, so either C or C++ is fine. C++11 preferred.

like image 786
imallett Avatar asked Aug 31 '15 22:08

imallett


2 Answers

Edit in response to comment and updated question

If you need performance and "elegancy" I would :

  • drop your multidimensional array approach and flatten it (i.e. one array dimension). No new, no pointer, use a C++ modern approach with std::vector or std::array.
  • provide an abstract container for your multi dim array with convenient methods such as a generic nested loop "generator"
  • Replace your variadic arguments with a fixed size array (since you know dims at compile time.

So I have found a following solution which is quite coherent with your implementation and needs, and try to remain simple.

I have managed rewriting a little MultiArray class in a "modern C++11 way". I consider here that the count dimensions might not be known at compile time hence the use of std::vector now. It is of course possible to have a more generic and compile time code with std::array see my original answer below.

#include <iostream>
#include <array>
#include <vector>
#include <numeric>

template<size_t DIMS>
class MultiArray {
public:
    // Point here is just an array
    using Point = std::array<double,DIMS>;

    // fill data_ with an init array
    // not that count is just a fix sized array here no variadic arguments needed
    MultiArray(const std::array<size_t, DIMS>& count) 
        : data_{init_array(count)} {}

private:   
    // the init functions are used for the constructor
    void init_point(Point& point, const std::array<size_t,DIMS>& coord, const std::array<size_t, DIMS>& count) {
        std::cout << " -> { ";
        for (size_t i = 0; i < DIMS; i ++) {
            point[i] = (coord[i] + 0.5) / count[i];
            std::cout << point[i] << ";";
        }
        std::cout << " }\n";
    }

    std::vector<Point> init_array(const std::array<size_t, DIMS>& count) {
        std::vector<Point> data(std::accumulate(count.begin(), count.end(), 1, std::multiplies<int>())); // accumulate computes the prod of DIMS total_count
        std::array<size_t, DIMS> current{};
        size_t i=0;
        do {
             for (size_t i = 0; i < DIMS; i ++)
                 std::cout << current[i] << ";";
            init_point(data[i++],current,count);
        } while (increment(current, count));
        return data;
    }

    // the following function allows to imitate the nested loop by incrementing multidim coordinates
    bool increment( std::array<size_t, DIMS>& v, const std::array<size_t, DIMS>& upper) {
        for (auto i = v.size(); i-- != 0; ) {
            ++v[i];
            if (v[i] != upper[i]) {
                return true;
            }
            v[i] = 0;
        }
        return false;
    }   
private:
    std::vector<Point> data_; // A flatten multi dim vector of points
};


int main() {
   std::array<size_t, 3> count{{4,5,3}};
   MultiArray<3> test{count};
}

Live on Coliru

As you can see in results data_ can be initialized generically for N dimensions. If you need a higher level abstract class, you can check below my original answer where you can perform some convenient operations (i.e. access to grid[{i,j,k}] for filling the values).

Original answer

I needed a multi dimensional grid for my needs and happened to ask improvements of my code on code review. Here a working example where of course some features might be unnecessary for you... My implementation relates on template and compile time computations. Note that the dimensions size must be known at compile time.

Briefly, the class would look like this :

template<typename T, size_t... DIMS> // variadic template here for the dimensions size
class MultiGrid {
   // Access from regular idx such as grid[64]
   T& operator[] (size_type idx)       { return values_[idx]; };
   // Access from multi dimensional coordinates such as grid[{6,3,4}]
   T& operator[] (const std::array<size_t,sizeof...(DIMS)>& coord)       { // can give code for runtime here };
private:
  std::array<T,sizeof...(DIMS)> data_;
}

And then you can construct your multidimensional array and initialize it these ways :

MultiGrid<float,DIM1,DIM2,DIM3> data; // 3D
// MultiGrid<float,DIM1,DIM2,DIM3,DIM4> data; // 4D
// etc...

// initialize it like this with nested arrays
for (size_t z=0; z < DIM3; z ++)
  for (size_t y=0; y < DIM2; y ++)
    for (size_t x=0; x < DIM1; x ++)
      data[{x,y,z}] = [...] // whatever

// or like this in C++11/14 way
for (auto &x : data) x = [...] // this is convenient to provide a container like approach since no nested arrays are needed here.

In case you need to specify an algorithm for variadic nested loops for filling out the values you can take a look here and do it this way with the first answer :

// here lower_bound is 0-filled vector
std::vector<int> current = lower_bound;
do {
   data[current] = [...] // fill in where current is a coordinate
} while (increment(current, lower_bound, upper_bound));

If you need something I miss in my implementation feel free to ask. I would also be glad if someone can point out improvements.

like image 101
coincoin Avatar answered Sep 29 '22 06:09

coincoin


Going from X,Y,Z to the flattened array (F) we have the following equation

F=(Z*DimY+y)*DimX+X

or

F=Z*DimY*DimX+Y*DimX+X

X = F % DimX
Y = F % DimX*DimY/DimX
Z = F % DimX*DimY*DimZ/DimX*DimY

in a 7 x 3 x 5 array, Z=3, Y=1, X=2 would be at 3*3*5 + 1*5 + 2= 45+5+2=52

X = `52` % 5 = 2
Y = `52` % (5 * 3) / 5 = 7 / 5 = 1
Z = `52` % (7 * 5 * 3)/15 = 52/15 = 3

in a 7 x 3 x 5 array, Z=4, Y=2, X=3 would be at 4*3*5 + 2*5 + 3= 60+10+3=73

X = `73` % 5 = 3
Y = `73` % (5 * 3) / 5 = 13 / 5 = 2
Z = `73` % (7 * 5 * 3)/15 = 73/15 = 4

If we keep the cumulative products in an array, mult { 1, X, X*Y, X*Y*Z, ...} and all points in an array, val

Points to flat array:

F=sum(mult[i]*val[i]);

Flat array to points:

i[0]=F%mult[1]/mult[0];
i[1]=F%mult[2]/mult[1];
...

We can then iterate over F (the flat array) , reverse engineer from the index into the flat array all points: X,Y,... as above and do the initialization you want in a generic loop:

Given mult as mult[0]=1; mult[d+1]=mult[d]*count[d];

for (int i = 0; i < total_count; ++i) {
    for (int d=0; d < dims; ++d) {
      int dim=(i%mult[d+1])/mult[d];
      point.data[d] = (dim+0.5) / (double)counts[d];
  }
}
like image 35
Glenn Teitelbaum Avatar answered Sep 29 '22 05:09

Glenn Teitelbaum