 

Is there a data structure with these characteristics?

I'm looking for a data structure that would allow me to store an M-by-N 2D matrix of values contiguously in memory, such that the distance in memory between any two points approximates the Euclidean distance between those points in the matrix. By contrast, in a typical row-major representation as a one-dimensional array of M * N elements, the memory distance between adjacent cells in the same row (1) differs from that between adjacent cells in neighbouring rows (N).
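To be concrete, by row-major I mean the usual layout, roughly as sketched below (the helper name is just for illustration):

    #include <cstddef>

    // Usual row-major index of cell (row, col) in an M-by-N matrix stored as a flat array.
    inline std::size_t row_major_index(std::size_t row, std::size_t col, std::size_t N) {
        return row * N + col;
    }
    // Horizontal neighbours end up 1 apart in memory, vertical neighbours N apart,
    // even though both pairs are at Euclidean distance 1 in the matrix.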

I'd like a data structure that reduces or removes this difference. Really, the name of such a structure is sufficient—I can implement it myself. If answers happen to refer to libraries for this sort of thing, that's also acceptable, but they should be usable with C++.

I have an application that needs to perform fast image convolutions without hardware acceleration, and though I'm aware of the usual optimisation techniques for this sort of thing, I feel a specialised data structure or data ordering could improve performance.

asked Aug 23 '10 by Jon Purdy


2 Answers

Given the requirement that you want to store the values contiguously in memory, I'd strongly suggest you research space-filling curves, especially Hilbert curves.

To give a bit of context, such curves are sometimes used in database indexes to improve the locality of multidimensional range queries (e.g., "find all items with x/y coordinates in this rectangle"), thereby aiming to reduce the number of distinct pages accessed. This is somewhat similar to the R-trees that have already been suggested here.

Either way, it looks like you're bound to an M*N array of values in memory, so the whole question is about how to arrange the values in that array, I figure. (Unless I misunderstood the question.)
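If it helps, here is a minimal sketch of the standard bit-twiddling mapping from (x, y) to a Hilbert-curve index (essentially the xy2d routine from the Wikipedia article on Hilbert curves), assuming the grid side n is a power of two; the names hilbert_rot and hilbert_index are just my own labels:

    #include <cstdint>
    #include <utility>

    // Rotate/flip a quadrant so the sub-curve is oriented correctly (helper for hilbert_index).
    static void hilbert_rot(std::uint32_t n, std::uint32_t& x, std::uint32_t& y,
                            std::uint32_t rx, std::uint32_t ry) {
        if (ry == 0) {
            if (rx == 1) {
                x = n - 1 - x;
                y = n - 1 - y;
            }
            std::swap(x, y);
        }
    }

    // Map (x, y) in an n-by-n grid (n a power of two) to its position along the Hilbert curve.
    std::uint32_t hilbert_index(std::uint32_t n, std::uint32_t x, std::uint32_t y) {
        std::uint32_t d = 0;
        for (std::uint32_t s = n / 2; s > 0; s /= 2) {
            std::uint32_t rx = (x & s) ? 1 : 0;
            std::uint32_t ry = (y & s) ? 1 : 0;
            d += s * s * ((3 * rx) ^ ry);
            hilbert_rot(n, x, y, rx, ry);
        }
        return d;
    }

You would then lay the values out in order of that index; for an M-by-N matrix that isn't a square power of two, you'd presumably round the side up to the next power of two and either leave the unused slots empty or compact the indices.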

So in fact, such orderings would probably only change the shape of the distance distribution: the average distance between any two randomly chosen points in the matrix should not change, so I have to agree with Oli there. Any potential benefit depends largely on your specific use case, I suppose.

answered Sep 28 '22 by ig2r


I would guess "no"! And if the answer happens to be "yes", then it's almost certainly so irregular that it'll be way slower for a convolution-type operation.

EDIT

To qualify my guess, take an example. Let's say we store a[0][0] first. We want a[k][0] and a[0][k] to be similar distances, and proportional to k, so we might choose to interleave the storage of first row and first column (i.e. a[0][0], a[1][0], a[0][1], a[2][0], a[0][2], etc.) But how do we now do the same for e.g. a[1][0]? All the locations near it in memory are now taken up by stuff that's near a[0][0].

Whilst there are other possibilities besides my example, I'd wager that you always end up with this kind of problem.

EDIT

If your data is sparse, then there may be scope to do something clever (re Cubbi's suggestion of R-trees). However, it'll still require irregular access and pointer chasing, so will be significantly slower than straightforward convolution for any given number of points.

answered Sep 28 '22 by Oliver Charlesworth