I'm building a suite of functions to work with a multidimensional-array data structure and I want to be able to define arbitrary slices of the arrays so I can implement a generalized inner product of two arbitrary matrices (aka Tensors or n-d arrays).
An APL paper I read (I honestly can't find which -- I've read so many) defines the matrix product on left-matrix X
with dimensions A;B;C;D;E;F
and right-matrix Y
with dimensions G;H;I;J;K
where F==G
as
Z <- X +.× Y
Z[A;B;C;D;E;H;I;J;K] <- +/ X[A;B;C;D;E;*] × Y[*;H;I;J;K]
where +/
is sum of, and × applies element-by-element to two vectors of the same length.
So I need "row" slices of the left and "column" slices of the right. I can of course take a transpose and then a "row" slice to simulate a "column" slice, but I'd rather do it more elegantly.
Wikipedia's article on slicing leads to a stub about dope vectors which seem to be the miracle cure I'm looking for, but there's not a lot there to go on.
How do I use a dope vector to implement arbitrary slicing?
(Much later I did notice Stride of an array which has some details.)
General array slicing can be implemented (whether or not built into the language) by referencing every array through a dope vector or descriptor — a record that contains the address of the first array element, and then the range of each index and the corresponding coefficient in the indexing formula. This technique also allows immediate array transposition, index reversal, subsampling, etc. For languages like C, where the indices always start at zero, the dope vector of an array with d indices has at least 1 + 2d parameters.
http://en.wikipedia.org/wiki/Array_slicing#Details
That's a dense paragraph, but it's actually all in there. So we need a data structure like this:
struct {
TYPE *data; //address of first array element
int rank; //number of dimensions
int *dims; //size of each dimension
int *weight; //corresponding coefficient in the indexing formula
};
Where TYPE
is whatever the element type is, the field of the matrices. For simplicity and concreteness, we'll just use int
. For my own purposes, I've devised an encoding of various types into integer handles so int
does the job for me, YMMV.
typedef struct arr {
int rank;
int *dims;
int *weight;
int *data;
} *arr;
All of the pointer members can be assigned locations within the same allocated block of memory as the struct itself (which we will call the header). But by replacing the earlier use of offsets and struct-hackery, the implementation of algorithms can be made independent of that actual memory layout within (or without) the block.
The basic memory layout for a self-contained array object is
rank dims weight data
dims[0] dims[1] ... dims[rank-1]
weight[0] weight[1] ... weight[rank-1]
data[0] data[1] ... data[ product(dims)-1 ]
An indirect array sharing data (whole array or 1 or more row-slices) will have the following memory layout
rank dims weight data
dims[0] dims[1] ... dims[rank-1]
weight[0] weight[1] ... weight[rank-1]
//no data! it's somewhere else!
And an indirect array containing an orthogonal slice along another axis will have the same layout as a basic indirect array, but with dims and weight suitably modified.
The access formula for an element with indices (i0 i1 ... iN) is
a->data[ i0*a->weight[0] + i1*a->weight[1] + ...
+ iN*a->weight[N] ]
, assuming each index i[j] is between [ 0 ... dims[j] ).
In the weight
vector for a normally laid-out row-major array, each element is the product of all lower dimensions.
for (int i=0; i<rank; i++)
weight[i] = product(dims[i+1 .. rank-1]);
So for a 3×4×5 array, the metadata would be
{ .rank=3, .dims=(int[]){3,4,5}, .weight=(int[]){4*5, 5, 1} }
or for a 7×6×5×4×3×2 array, the metadata would be
{ .rank=6, .dims={7,6,5,4,3,2}, .weight={720, 120, 24, 6, 2, 1} }
So, to create one of these, we need the same helper function from the previous question to compute the size from a list of dimensions.
/* multiply together rank integers in dims array */
int productdims(int rank, int *dims){
int i,z=1;
for(i=0; i<rank; i++)
z *= dims[i];
return z;
}
To allocate, simply malloc
enough memory and set the pointers to the appropriate places.
/* create array given rank and int[] dims */
arr arraya(int rank, int dims[]){
int datasz;
int i;
int x;
arr z;
datasz=productdims(rank,dims);
z=malloc(sizeof(struct arr)
+ (rank+rank+datasz)*sizeof(int));
z->rank = rank;
z->dims = z + 1;
z->weight = z->dims + rank;
z->data = z->weight + rank;
memmove(z->dims,dims,rank*sizeof(int));
for(x=1, i=rank-1; i>=0; i--){
z->weight[i] = x;
x *= z->dims[i];
}
return z;
}
And using the same trick from the previous answer, we can make a variable-argument interface to make usage easier.
/* load rank integers from va_list into int[] dims */
void loaddimsv(int rank, int dims[], va_list ap){
int i;
for (i=0; i<rank; i++){
dims[i]=va_arg(ap,int);
}
}
/* create a new array with specified rank and dimensions */
arr (array)(int rank, ...){
va_list ap;
//int *dims=calloc(rank,sizeof(int));
int dims[rank];
int i;
int x;
arr z;
va_start(ap,rank);
loaddimsv(rank,dims,ap);
va_end(ap);
z = arraya(rank,dims);
//free(dims);
return z;
}
And even automatically produce the rank argument by counting the other arguments using the awesome powers of ppnarg.
#define array(...) (array)(PP_NARG(__VA_ARGS__),__VA_ARGS__) /* create a new array with specified dimensions */
Now constructing one of these is very easy.
arr a = array(2,3,4); // create a dynamic [2][3][4] array
An element is retrieved with a function call to elema
which multiplies each index by the corresponding weight, sums them, and indexes the data
pointer. We return a pointer to the element so it can be read or modified by the caller.
/* access element of a indexed by int[] */
int *elema(arr a, int *ind){
int idx = 0;
int i;
for (i=0; i<a->rank; i++){
idx += ind[i] * a->weight[i];
}
return a->data + idx;
}
The same varargs trick can make an easier interface elem(a,i,j,k)
.
To take a slice, first we need a way of specifying which dimensions to extract and which to collapse. If we just need to select a single index or all elements from a dimension, then the slice
function can take rank int
s as arguments and interpret -1 as selecting the whole dimension or 0..dimsi-1 as selecting a single index.
/* take a computed slice of a following spec[] instructions
if spec[i] >= 0 and spec[i] < a->rank, then spec[i] selects
that index from dimension i.
if spec[i] == -1, then spec[i] selects the entire dimension i.
*/
arr slicea(arr a, int spec[]){
int i,j;
int rank;
for (i=0,rank=0; i<a->rank; i++)
rank+=spec[i]==-1;
int dims[rank];
int weight[rank];
for (i=0,j=0; i<rank; i++,j++){
while (spec[j]!=-1) j++;
if (j>=a->rank) break;
dims[i] = a->dims[j];
weight[i] = a->weight[j];
}
arr z = casta(a->data, rank, dims);
memcpy(z->weight,weight,rank*sizeof(int));
for (j=0; j<a->rank; j++){
if (spec[j]!=-1)
z->data += spec[j] * a->weight[j];
}
return z;
}
All the dimensions and weights corresponding to the -1s in the argument array are collected and used to create a new array header. All arguments >= 0 are multiplied by their associated weight and added to the data
pointer, offsetting the pointer to the correct element.
In terms of the array access formula, we're treating it as a polynomial.
offset = constant + sum_i=0,n( weight[i] * index[i] )
So for any dimension from which we're selecting a single element (+ all lower dimensions), we factor-out the now-constant index and add it to the constant term in the formula (which in our C representation is the data
pointer itself).
The helper function casta
creates the new array header with shared data
. slicea
of course changes the weight values, but by calculating weights itself, casta
becomes a more generally usable function. It can even be used to construct a dynamic array structure that operates directly on a regular C-style multidimensional array, thus casting.
/* create an array header to access existing data in multidimensional layout */
arr casta(int *data, int rank, int dims[]){
int i,x;
arr z=malloc(sizeof(struct arr)
+ (rank+rank)*sizeof(int));
z->rank = rank;
z->dims = z + 1;
z->weight = z->dims + rank;
z->data = data;
memmove(z->dims,dims,rank*sizeof(int));
for(x=1, i=rank-1; i>=0; i--){
z->weight[i] = x;
x *= z->dims[i];
}
return z;
}
The dope vector can also be used to implement transposes. The order of the dimensions (and indices) can be changed.
Remember that this is not a normal 'transpose' like everybody else does. We don't rearrange the data at all. This is more properly called a 'dope-vector pseudo-transpose'. Instead of changing the data, we just change the constants in the access formula, rearranging the coefficients of the polynomial. In a real sense, this is just an application of the commutativity and associativity of addition.
So for concreteness, assume the data is arranged sequentially starting at hypothetical address 500.
500: 0
501: 1
502: 2
503: 3
504: 4
505: 5
506: 6
if a is rank 2, dims {1, 7), weight (7, 1), then the sum of the indices multiplied by the associated weights added to the initial pointer (500) yield the appropriate addresses for each element
a[0][0] == *(500+0*7+0*1)
a[0][1] == *(500+0*7+1*1)
a[0][2] == *(500+0*7+2*1)
a[0][3] == *(500+0*7+3*1)
a[0][4] == *(500+0*7+4*1)
a[0][5] == *(500+0*7+5*1)
a[0][6] == *(500+0*7+6*1)
So the dope-vector pseudo-transpose rearranges the weights and dimensions to match the new ordering of indices, but the sum remains the same. The initial pointer remains the same. The data does not move.
b[0][0] == *(500+0*1+0*7)
b[1][0] == *(500+1*1+0*7)
b[2][0] == *(500+2*1+0*7)
b[3][0] == *(500+3*1+0*7)
b[4][0] == *(500+4*1+0*7)
b[5][0] == *(500+5*1+0*7)
b[6][0] == *(500+6*1+0*7)
So, by using general slices or transpose+"row"-slices (which are easier), generalized inner product can be implemented.
First we need the two helper functions for applying a binary operation to two vectors producing a vector result, and reducing a vector with a binary operation producing a scalar result.
As in the previous question we'll pass in the operator, so the same function can be used with many different operators. For the style here, I'm passing operators as single characters, so there's already an indirect mapping from C operators to our function's operators. This is an x-macro table.
#define OPERATORS(_) \
/* f F id */ \
_('+',+,0) \
_('*',*,1) \
_('=',==,1) \
/**/
#define binop(X,F,Y) (binop)(X,*#F,Y)
arr (binop)(arr x, char f, arr y); /* perform binary operation F upon corresponding elements of vectors X and Y */
The extra element in the table is for the reduce
function for the case of a null vector argument. In that case, reduce should return the operator's identity element, 0 for +
, 1 for *
.
#define reduce(F,X) (reduce)(*#F,X)
int (reduce)(char f, arr a); /* perform binary operation F upon adjacent elements of vector X, right to left,
reducing vector to a single value */
So the binop
does a loop and a switch on the operator character.
/* perform binary operation F upon corresponding elements of vectors X and Y */
#define BINOP(f,F,id) case f: *elem(z,i) = *elem(x,i) F *elem(y,i); break;
arr (binop)(arr x, char f, arr y){
arr z=copy(x);
int n=x->dims[0];
int i;
for (i=0; i<n; i++){
switch(f){
OPERATORS(BINOP)
}
}
return z;
}
#undef BINOP
And the reduce function does a backwards loop if there are enough elements, having set the initial value to the last element if there was one, having preset the initial value to the operator's identity element.
/* perform binary operation F upon adjacent elements of vector X, right to left,
reducing vector to a single value */
#define REDID(f,F,id) case f: x = id; break;
#define REDOP(f,F,id) case f: x = *elem(a,i) F x; break;
int (reduce)(char f, arr a){
int n = a->dims[0];
int x;
int i;
switch(f){
OPERATORS(REDID)
}
if (n) {
x=*elem(a,n-1);
for (i=n-2;i>=0;i--){
switch(f){
OPERATORS(REDOP)
}
}
}
return x;
}
#undef REDID
#undef REDOP
And with these tools, inner product can be implemented in a higher-level manner.
/* perform a (2D) matrix multiplication upon rows of x and columns of y
using operations F and G.
Z = X F.G Y
Z[i,j] = F/ X[i,*] G Y'[j,*]
more generally,
perform an inner product on arguments of compatible dimension.
Z = X[A;B;C;D;E;F] +.* Y[G;H;I;J;K] |(F = G)
Z[A;B;C;D;E;H;I;J;K] = +/ X[A;B;C;D;E;*] * Y[*;H;I;J;K]
*/
arr (matmul)(arr x, char f, char g, arr y){
int i,j;
arr xdims = cast(x->dims,1,x->rank);
arr ydims = cast(y->dims,1,y->rank);
xdims->dims[0]--;
ydims->dims[0]--;
ydims->data++;
arr z=arraya(x->rank+y->rank-2,catv(xdims,ydims)->data);
int datasz = productdims(z->rank,z->dims);
int k=y->dims[0];
arr xs = NULL;
arr ys = NULL;
for (i=0; i<datasz; i++){
int idx[x->rank+y->rank];
vector_index(i,z->dims,z->rank,idx);
int *xdex=idx;
int *ydex=idx+x->rank-1;
memmove(ydex+1,ydex,y->rank);
xdex[x->rank-1] = -1;
free(xs);
free(ys);
xs = slicea(x,xdex);
ys = slicea(y,ydex);
z->data[i] = (reduce)(f,(binop)(xs,g,ys));
}
free(xs);
free(ys);
free(xdims);
free(ydims);
return z;
}
This function also uses the functions cast
which presents a varargs interface to casta
.
/* create an array header to access existing data in multidimensional layout */
arr cast(int *data, int rank, ...){
va_list ap;
int dims[rank];
va_start(ap,rank);
loaddimsv(rank,dims,ap);
va_end(ap);
return casta(data, rank, dims);
}
And it also uses vector_index
to convert a 1D index into an nD vector of indices.
/* compute vector index list for ravel index ind */
int *vector_index(int ind, int *dims, int n, int *vec){
int i,t=ind, *z=vec;
for (i=0; i<n; i++){
z[n-1-i] = t % dims[n-1-i];
t /= dims[n-1-i];
}
return z;
}
github file. Additional supporting functions are also in the github file.
This Q/A pair is part of a series of related issues which arose in implementing inca an interpreter for the APL language written in C. Others: How can I work with dynamically-allocated arbitrary-dimensional arrays? , and How to pass a C math operator (+-*/%) into a function result=mathfunc(x,+,y);? . Some of this material was previously posted to comp.lang.c. More background in comp.lang.apl.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With