Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Optimized "Multidimensional" Arrays in Ruby

From birth I've always been taught to avoid nested arrays like the plague for performance and internal data structure reasons. So I'm trying to find a good solution for optimized multidimensional data structures in Ruby.

The typical solution would involve maybe using a 1D array and accessing each one by x*width + y.

Ruby has the ability to overload the [] operator, so perhaps a good solution would involve using multi_dimensional_array[2,4] or even use a splat to support arbitrary dimension amounts. (But really, I only need two dimensions)

Is there a library/gem already out there for this? If not, what would the best way to go about writing this be?

My nested-array-lookups are the bottleneck right now of my rather computationally-intensive script, so this is something that is important and not a case of premature optimization.

If it helps, my script uses mostly random lookups and less traversals.

like image 854
Justin L. Avatar asked Feb 26 '23 06:02

Justin L.


2 Answers

narray

NArray is an Numerical N-dimensional Array class. Supported element types are 1/2/4-byte Integer, single/double-precision Real/Complex, and Ruby Object. This extension library incorporates fast calculation and easy manipulation of large numerical arrays into the Ruby language. NArray has features similar to NumPy, but NArray has vector and matrix subclasses.

like image 154
John La Rooy Avatar answered Mar 08 '23 11:03

John La Rooy


You could inherit from Array and create your own class that emulated a multi-dimensional array (but was internally a simple 1-dimensional array). You may see some speedup from it, but it's hard to say without writing the code both ways and profiling it.

You may also want to experiment with the NArray class.

All that aside, your nested array lookups might not be the real bottleneck that they appear to be. On several occasions, I have had the same problem and then later found out that re-writing some of my logic cleared up the bottleneck. It's more than just speeding up the nested lookups, it's about minimizing the number of lookups needed. Each "random access" in an n-dimensional array takes n lookups (one per nested array level). You can reduce this by iterating through the dimensions using code like:

array.each {|x|
    x.each {|y|
        y.each {|z|
            ...
        }
    }
}

This allows you to do a single lookup in the first dimension and then access everything "behind" it, then a single lookup in the second dimension, etc etc. This will result in significantly fewer lookups than randomly accessing elements.

If you need random element access, you may want to try using a hash instead. You can take the array indices, concatenate them together as a string, and use that as the hash key (for example, array[12][0][3] becomes hash['0012_0000_0003']). This may result in faster "random access" times, but you'd want to profile it to know for certain.

Any chance you can post some of your problematic code? Knowing the problem code will make it easier for us to recommend a solution.

like image 25
bta Avatar answered Mar 08 '23 10:03

bta