Is calculating the index in an array more efficient than letting the compiler do it?

I'm trying to generalize a neural network function to arbitrarily many layers, so I need multiple matrices to hold the weights for each neuron in each layer. I was originally declaring explicit matrix objects in R to hold my weights for each layer. Instead of having one matrix per layer, I thought of a way (not claiming it's original) to store all of my weights in a single array, and defined an "indexing function" to map each weight to its appropriate index in the array.

I defined an indexing function that maps w(i, j, k), the k-th weight of the j-th neuron in the i-th layer, to a position in the array, where L(r) is the number of neurons in layer r. (After writing these definitions, I realized that Stack Overflow doesn't allow LaTeX like MathOverflow does, which is unfortunate.) Now the question is: is it more efficient to compute the index of my weights in this way, or is it actually less efficient? From what I've read about how array indices are computed in general, this is essentially what the compiler does anyway if I just keep one weight matrix per layer, so it seems like I may just be making my code overly complicated and harder to understand if there's no difference in time efficiency.
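As a concrete illustration, such an indexing function might look like the following. This is a sketch in Python rather than R, and it assumes 1-based neuron/weight numbering and a list `L` of layer sizes with `L[0]` being the input layer; none of these details are given in the question, so treat the layout as one plausible choice.

```python
def weight_index(i, j, k, L):
    """Map the k-th weight of the j-th neuron in the i-th layer
    to a position in one flat array (0-based result).
    L[r] is the number of neurons in layer r; L[0] is the input layer."""
    # Total number of weights in all layers before layer i:
    # each of the L[r] neurons in layer r has L[r-1] incoming weights.
    offset = sum(L[r - 1] * L[r] for r in range(1, i))
    # Within layer i, neuron j's weights are stored contiguously.
    return offset + (j - 1) * L[i - 1] + (k - 1)
```

For example, with `L = [3, 4, 2]` there are 3·4 + 4·2 = 20 weights, and the function maps them onto indices 0 through 19.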

asked Mar 05 '18 by TheBluegrassMathematician


1 Answer

There are many factors to take into consideration in each of the approaches. I'm not familiar with R, but I'm assuming matrices' buffers are represented as one-dimensional arrays in memory. (Even if they are written as two-dimensional arrays in the underlying C implementation, the compiler stores them as one-dimensional arrays in memory.)

The overall outline of the memory operations is:

  1. Case: one matrix per layer

    • Allocation of matrices: one allocation per layer
    • Accessing the indices: one memory access per weight
  2. Case: one array for all layers + index calculation

    • Allocation of the array: a single allocation
    • Accessing each of the indices: one memory access per weight
    • Index function: a few arithmetic operations (and possibly a function call) per access

We can clearly see that the second case scales better, even though there's the additional cost of the index computation.
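The two cases above can be sketched like this (in Python rather than R, purely to make the allocation difference concrete; the layer sizes are made up):

```python
L = [3, 4, 2]  # example layer sizes: L[0] is the input layer

# Case 1: one weight matrix per layer -> one allocation per layer,
# and the matrices may end up scattered around in memory.
per_layer = [
    [[0.0] * L[r - 1] for _ in range(L[r])]  # L[r] neurons, L[r-1] weights each
    for r in range(1, len(L))
]

# Case 2: one flat array for all layers -> a single contiguous allocation.
n_weights = sum(L[r - 1] * L[r] for r in range(1, len(L)))
flat = [0.0] * n_weights
```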

Having said that, in general, a single statically allocated array holding all the weights for all the layers should be faster.

In most cases, a computer's bottleneck is memory bandwidth, and the best way to counteract this is to minimize the number of memory accesses.

With this in mind, there's another, more fundamental reason why the second approach will probably be faster: caches.

Here's a good explanation of the performance difference in accessing a two-dimensional array in a loop by Good Ol' Bob Martin.

TL;DR: Caches take advantage of the principle of locality, so having memory accesses spatially close to each other (as you would in one single array, accessed in a cache-friendly order as explained in Bob Martin's answer) yields better performance than having them spatially separated (spread across several distinct arrays).
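To make the locality point concrete, here is a small sketch (a hypothetical example in Python, not from the linked answer): both loops visit the same elements of a flat row-major buffer, but the first walks memory sequentially while the second jumps `cols` elements at a time.

```python
def sum_row_major(a, rows, cols):
    # Walks the flat buffer sequentially: a[0], a[1], a[2], ...
    # This is the cache-friendly access pattern.
    total = 0.0
    for i in range(rows):
        for j in range(cols):
            total += a[i * cols + j]
    return total

def sum_col_major(a, rows, cols):
    # Same elements, but strided accesses: a[0], a[cols], a[2*cols], ...
    # Each access may land on a different cache line.
    total = 0.0
    for j in range(cols):
        for i in range(rows):
            total += a[i * cols + j]
    return total
```

Both functions return the same sum; on large arrays, especially in compiled code, the sequential version is typically much faster.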

PS: I also recommend benchmarking both approaches and comparing, since these cache nuances are machine-dependent. It might be the case that the dataset/NN is small enough to fit completely in RAM, or even in cache, on a very powerful server.
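A minimal benchmarking sketch for the two approaches, using the standard library's `timeit` (the layer sizes are invented, and absolute numbers are machine-dependent, so only the relative comparison on your own machine matters):

```python
import timeit

L = [64, 64, 64, 10]  # made-up layer sizes

# Approach 1: one matrix (list of lists) per layer.
per_layer = [[[0.1] * L[r - 1] for _ in range(L[r])] for r in range(1, len(L))]

# Approach 2: one flat array holding all weights contiguously.
n_weights = sum(L[r - 1] * L[r] for r in range(1, len(L)))
flat = [0.1] * n_weights

def sum_per_layer():
    # Touch every weight via the nested per-layer structure.
    s = 0.0
    for layer in per_layer:
        for neuron in layer:
            for w in neuron:
                s += w
    return s

def sum_flat():
    # Touch every weight via the single contiguous array.
    return sum(flat)

t1 = timeit.timeit(sum_per_layer, number=100)
t2 = timeit.timeit(sum_flat, number=100)
print(f"per-layer: {t1:.3f}s  flat: {t2:.3f}s")
```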

answered Nov 21 '22 by chibby0ne