Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fast element access for multi-dimensional representation of contiguous array

I have a multi-dimensional array represented contiguously in memory. I want to keep this representation hidden and just let the user access the array elements as if it were a multi-dimensional one: e.g. my_array[0][3][5] or my_array(0,3,5) or something similar. The dimensions of the object are not determined until runtime, but the object is created with a type that specifies how many dimensions it has. This element look-up will need to be called billions of times, and so should hopefully involve minimal overhead for each call.

I have looked at similar questions but not really found a good solution. Using the [] operator requires the creation of N-1 dimensional objects, which is fine for multi-dimensional structures like vectors-of-vectors because the object already exists, but for a contiguous array it seems like it would get convoluted very quickly and require some kind of slicing through the original array.

I have also looked at overloading (), which seems more promising, but requires specifying the number of arguments, which will vary depending upon the number of dimensions of the array. I have thought about using list initialization or vectors, but wanted to avoid instantiating objects.

I am only a little familiar with templates and figure that there should be some way with C++'s majestic template powers to specify a unique overloading of () for arrays with different types (e.g. different numbers of dimensions). But I have only used templates in really basic generic cases like making a function use both float and double.

I am imagining something like this:

template<typename TDim>
class MultiArray {
public:
  MultiArray() {} //build some things
  ~MultiArray() {} //destroy some things

  // The number of arguments would be == to TDim for the instantiated class
  float& operator() (int dim1, int dim2, ...) {
    //convert to contiguous index and return ref to element
    // I believe the conversion equation is something like:
    // dim1 + Maxdim1 * ( dim2 + MaxDim2 * ( dim3 + MaxDim3 * (...)))
  }

private:
  vector<float> internal_array;
  vector<int> MaxDimX; // Each element says how large each corresponding dim is.
};

So if I initialize this class and attempted to access an element, it would look something like this:

my_array = MultiArray<4>();
element = my_array(2,5,4,1);

How might I go about doing this using templates? Is this even possible?

like image 525
user5915738 Avatar asked Dec 07 '17 00:12

user5915738


People also ask

Are multidimensional arrays contiguous?

Yes they are contiguous.

How a multidimensional array is represented in memory?

Row-major order and column-major order are methods for storing multidimensional arrays in linear storage such as random access memory.

Which library can be used to support multidimensional array?

numpy is the core library for scientific computing in Python. It provides a high-performance multidimensional array object and tools for working with these arrays.

What is multi-dimensional array array?

A multidimensional array in MATLAB® is an array with more than two dimensions. In a matrix, the two dimensions are represented by rows and columns. Each element is defined by two subscripts, the row index and the column index.


1 Answers

template<class T>
struct slice {
    T* data = 0;
    std::size_t const* stride = 0;
    slice operator[](std::size_t I)const {
        return{ data + I* *stride, stride + 1 };
    }
    operator T&()const {
      return *data;
    }
    T& operator=(typename std::remove_const<T>::type in)const {
      *data = std::move(in); return *data;
    }
};

store a vector<T> of data, and an std::vector<std::size_t> stride of strides, where stride[0] is the element-stride that the first index wants.

template<class T>
struct buffer {
  std::vector<T> data;
  std::vector<std::size_t> strides;

  buffer( std::vector<std::size_t> sizes, std::vector<T> d ):
    data(std::move(d)),
    strides(sizes)
  {
    std::size_t scale = 1;
    for (std::size_t i = 0; i<sizes.size(); ++i){
      auto next = scale*strides[sizes.size()-1-i];
      strides[sizes.size()-1-i] = scale;
      scale=next;
    }
  }
  slice<T> get(){ return {data.data(), strides.data()}; }
  slice<T const> get()const{ return {data.data(), strides.data()}; }
};

c++14. Live example.

If you use not enough []s it refers to the first element of the subarray in question. If you use too many it does UB. It does zero dimension checking, both in count of dimensions and in size.

Both can be added, but would cost performance.

The number of dimensions is dynamic. You can split buffer into two types, one that owns the buffer and the other that provides the dimensioned view of it.

like image 92
Yakk - Adam Nevraumont Avatar answered Sep 30 '22 13:09

Yakk - Adam Nevraumont