find iso-cost points on a 3d grid efficiently with minimum costing of points

I have a 3D grid where each point (x, y, z) on the grid is associated with a cost value. The cost of any point (x, y, z) is not known in advance. To find the cost, we need to make a complex query which is really expensive. One thing we do know is that the cost is monotonically non-decreasing in all 3 dimensions.

Now, given a cost C, I need to find the points (x, y, z) on the iso-cost surface which have cost C. This has to be done while costing the bare minimum number of points. How can I solve this problem?

When I searched online, I found contour-identification techniques such as marching cubes, but these all assume every point's cost is known in advance. In my case, the major metric is that the number of points costed should be minimal.

It would be helpful if someone could suggest a way to get at least approximate locations, if not exact ones.

asked Dec 02 '14 by CRM

3 Answers

This is not an answer per se, just slightly generalized example C code. (The code was too long to include verbatim.)

The basic implementation is in grid.h (pastebin link).

In it, I've tried to make a distinction between grid coordinates (0 ≤ x, y, z ≤ size−1) and cell coordinates (0 ≤ x, y, z ≤ size−2). In particular, note the span type. Each cell spans a range of values: either interpolated, or the discrete set of the samples at the eight corners of the cell. Because this example uses linear interpolation to determine where within each cell the isosurface intersects the edges or a diagonal, I assume continuous spans.
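To illustrate, here is a minimal sketch of what such a span type and per-cell span computation could look like (names and layout are hypothetical; the actual grid.h differs):

#include <stddef.h>

/* Hypothetical span type: the range of field values one cell covers.
   With linear interpolation, a cell attains every value between the
   smallest and the largest of its eight corner samples. */
typedef struct {
    double min;
    double max;
} span;

/* Span of the cell whose lowest corner is (x,y,z), for corner samples
   stored as sample[(x*size + y)*size + z] over a size^3 grid. */
static span cell_span(const double *sample, size_t size,
                      size_t x, size_t y, size_t z)
{
    span s;
    s.min = s.max = sample[(x * size + y) * size + z];
    for (unsigned corner = 1; corner < 8; corner++) {
        size_t cx = x + ((corner >> 2) & 1);
        size_t cy = y + ((corner >> 1) & 1);
        size_t cz = z + (corner & 1);
        double v = sample[(cx * size + cy) * size + cz];
        if (v < s.min) s.min = v;
        if (v > s.max) s.max = v;
    }
    return s;
}

The isosurface for value c then passes through a cell exactly when s.min ≤ c ≤ s.max.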

I didn't realize how important the cells' value spans are for the edge cases until I implemented this example code. That is why the OP and I discussed the edge cases in the comments to my other answer, and why the logic outlined in my other answer alone does not handle the edge cases correctly.

Since OP's particular case is not that common/interesting, this example is much more generic (and therefore quite unoptimized for OP's case). In fact, this example only requires that the function has no local minima or maxima (saddle points and constant regions are allowed); just one minimum and one maximum within the gridded region. The minimum and maximum do not need to be point-like; they can be continuous regions.

As such, at grid generation time, we do not know which cells contain the minimum and maximum. (In OP's case, the scalar field is monotonically non-decreasing and limited to the positive octant, so the minimum is at (0, 0, 0) and the maximum at (size−1, size−1, size−1).)

To find the minimum and maximum, I implemented two functions that start from the best corner in the grid (the one having the smallest or greatest sample value). grid_maximum_cell() walks non-decreasing cells, and grid_minimum_cell() walks non-increasing cells. Since the scalar field is sampled, we implicitly assume it is continuous. As long as there are no local maxima or minima where the walk might stop, the walk reaches the correct cell in relatively few samples. (This search could be optimized much further, though. Consider these two functions just starting points for your own implementation; a sketch of the idea follows. The OP does not need these at all, of course.)
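The sketch below shows the idea in its simplest form (a hedged sketch, not the actual grid.h code; field is the expensive sampling function, and the hill-climb only finds the true maximum because the field is assumed to have no local maxima):

#include <stddef.h>

/* Hypothetical greedy walk towards the sample-wise maximum: from the
   current point, step to any axis neighbour with a larger sample,
   until no neighbour improves. A real implementation should cache
   samples; grid_minimum_cell() is the same walk with < instead of >. */
typedef double (*field_fn)(size_t x, size_t y, size_t z);

static void walk_to_maximum(field_fn field, size_t size,
                            size_t *x, size_t *y, size_t *z)
{
    double best = field(*x, *y, *z);
    int improved = 1;
    while (improved) {
        improved = 0;
        static const long step[6][3] = {
            { 1, 0, 0 }, { -1, 0, 0 },
            { 0, 1, 0 }, { 0, -1, 0 },
            { 0, 0, 1 }, { 0, 0, -1 },
        };
        for (int i = 0; i < 6; i++) {
            long nx = (long)*x + step[i][0];
            long ny = (long)*y + step[i][1];
            long nz = (long)*z + step[i][2];
            if (nx < 0 || ny < 0 || nz < 0 ||
                nx >= (long)size || ny >= (long)size || nz >= (long)size)
                continue;
            double v = field((size_t)nx, (size_t)ny, (size_t)nz);
            if (v > best) {
                best = v;
                *x = (size_t)nx; *y = (size_t)ny; *z = (size_t)nz;
                improved = 1;
                break;
            }
        }
    }
}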

(Actually, the requirement for the sampled scalar field is that each isosurface is continuous, and that all isosurfaces intersect the line drawn between the minimum and maximum cells found using the above two functions.)

The function grid_isosurface() can be used to locate the cells the desired isosurface (field value) passes through. The last parameter is a function pointer. That function is called once for each cell the isosurface passes through. (Note the indexing order for the corner samples, [x][y][z].)

grid_isosurface() locates an initial cell the desired isosurface passes through using a binary search (on the line from the cell containing the minimum sample to the cell containing the maximum sample). It then traces the surface, using the flood-fill-like algorithm outlined in my answer. The initial binary search might look like the sketch below.
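This is a hedged sketch with hypothetical names, reusing the span type from above; it assumes every isosurface intersects the minimum-to-maximum line, as stated earlier:

typedef struct { size_t x, y, z; } cell;

/* Binary search along the discretized line from cell lo (containing
   the minimum) to cell hi (containing the maximum), for a cell whose
   span contains value c. get_span() samples the cell's corners. */
static cell find_initial_cell(cell lo, cell hi, double c,
                              span (*get_span)(cell))
{
    double t0 = 0.0, t1 = 1.0;
    for (;;) {
        double t = 0.5 * (t0 + t1);
        cell mid = {
            (size_t)((double)lo.x + ((double)hi.x - (double)lo.x) * t + 0.5),
            (size_t)((double)lo.y + ((double)hi.y - (double)lo.y) * t + 0.5),
            (size_t)((double)lo.z + ((double)hi.z - (double)lo.z) * t + 0.5),
        };
        span s = get_span(mid);
        if ((s.min <= c && c <= s.max) || t1 - t0 < 1e-9)
            return mid;           /* isosurface passes through this cell */
        if (c < s.min)
            t1 = t;               /* cell values too high: move towards lo */
        else
            t0 = t;               /* cell values too low: move towards hi */
    }
}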

For an example, grid.c (pastebin link) uses the above include file to evaluate the scalar field

f(x, y, z) = x³ + y³ + z³ + x + y − 0.125·(x·y + x·z + y·z + x·y·z).
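For reference, a direct C transcription of that field (the real grid.c may differ in details):

/* f(x,y,z) = x^3 + y^3 + z^3 + x + y - 0.125*(xy + xz + yz + xyz) */
static double field(double x, double y, double z)
{
    return x*x*x + y*y*y + z*z*z + x + y
         - 0.125 * (x*y + x*z + y*z + x*y*z);
}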

On my Linux machine, I compiled and ran the example using

gcc -Wall -std=c99 -Wno-unused -O2 grid.c -o isosurface
./isosurface 50 -1.0 1.0 0.0 > out-0.0
./isosurface 50 -1.0 1.0 0.5 > out-0.5
./isosurface 50 -1.0 1.0 1.0 > out-1.0

and used Gnuplot to plot out the three isosurfaces:

splot "out-0.0" u 1:2:3 notitle w dots, "out-0.5" u 1:2:3 notitle w dots, "out-1.0" u notitle w dots

which leads to this pretty nice point cloud (rotatable in Gnuplot): [image: three isosurfaces as point clouds]

When the grid is initially generated, 14 samples are taken to locate the maximum and minimum cells. Tracing the isosurfaces required an additional 18024, 18199, and 16953 samples, respectively; note that far fewer samples are needed for the second and further isosurfaces if you trace them consecutively on the same grid.

The total grid above contains 51×51×51 = 132651 samples, so tracing one isosurface required about 13% of the grid points to be sampled. For a 101×101×101 grid, the fraction of points sampled drops to about 7%; for a 201×201×201 grid, to 3.5%; and for a 501×501×501 grid, to 1.4% (1.7M out of 125.75M samples).

None of this code is optimized for OP's case, nor optimized in general. A sample cache is used to minimize the number of samples needed overall, but the grid_isosurface() isosurface-walking function, and the initial grid_minimum_cell() and grid_maximum_cell() functions, can be modified to require slightly fewer samples. For larger grids, I don't expect the optimizations to make much of a difference, but for very small grids and very slow functions to evaluate, it might be worthwhile. The cache itself can be as simple as the sketch below.
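A minimal sketch of such a cache, assuming the grid is small enough to allocate one double per point and that NAN never occurs as a legitimate field value (field_fn is from the earlier sketch):

#include <math.h>
#include <stdlib.h>

/* Hypothetical sample cache: evaluate the expensive field function at
   most once per grid point. The value array holds size^3 doubles, all
   initialized to NAN (meaning "not yet sampled"). */
typedef struct {
    size_t size;
    double *value;
} sample_cache;

static double cached_sample(sample_cache *cache, field_fn field,
                            size_t x, size_t y, size_t z)
{
    double *slot = &cache->value[(x * cache->size + y) * cache->size + z];
    if (isnan(*slot))
        *slot = field(x, y, z);
    return *slot;
}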

If the intent is to generate a polygon mesh for each isosurface, I recommend generating each polygon in the callback function, not from the overall generated point cloud. Using the edge/diagonal intersections as in the above example program, you get all the vertices for the polygon spanning that cell (no caches or such are needed). All you need is to order the edge intersection points correctly; one way to do that is sketched below.
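This is a sketch under the assumption that each per-cell polygon is convex and roughly planar, which the small polygons from linear interpolation are; the normal can be estimated from the gradient of the corner samples:

#include <math.h>

typedef struct { double x, y, z; } vec3;

static vec3 v3sub(vec3 a, vec3 b)
{
    vec3 r = { a.x - b.x, a.y - b.y, a.z - b.z };
    return r;
}
static double v3dot(vec3 a, vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
static vec3 v3cross(vec3 a, vec3 b)
{
    vec3 r = { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x };
    return r;
}

/* Order the n (<= 8) edge-intersection points of one cell into polygon
   order: sort by angle around their centroid, in the plane
   perpendicular to the (approximate) isosurface normal at the cell. */
static void order_polygon(vec3 *pt, int n, vec3 normal)
{
    vec3 c = { 0.0, 0.0, 0.0 };
    for (int i = 0; i < n; i++) {
        c.x += pt[i].x; c.y += pt[i].y; c.z += pt[i].z;
    }
    c.x /= n; c.y /= n; c.z /= n;

    vec3 u = v3sub(pt[0], c);          /* two axes spanning the plane */
    vec3 v = v3cross(normal, u);

    double ang[8];
    for (int i = 0; i < n; i++) {
        vec3 d = v3sub(pt[i], c);
        ang[i] = atan2(v3dot(d, v), v3dot(d, u));
    }
    for (int i = 1; i < n; i++) {      /* insertion sort by angle */
        vec3 p = pt[i];
        double a = ang[i];
        int j = i;
        while (j > 0 && ang[j - 1] > a) {
            pt[j] = pt[j - 1];
            ang[j] = ang[j - 1];
            j--;
        }
        pt[j] = p;
        ang[j] = a;
    }
}

Sorting against a consistently chosen normal also gives each cell's polygon a consistent winding.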

Questions? Comments? Bug fixes? Suggestions?

answered Nov 20 '22 by Nominal Animal

You should look into this article, which talks about the 2-dimensional case and gives you great insight into the different methodologies: http://leetcode.com/2010/10/searching-2d-sorted-matrix.html

In my opinion, the step-wise linear search (in part II there) would be a great first step for you, because it's very easy to apply to the 3-D case and it really doesn't require a lot of experience to understand.
Because it is so straightforward and still very efficient, I would go with it and see if it fits your needs for the kind of data you're working with in 3-D; a rough sketch follows.
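For reference, the 2-D step-wise linear search looks roughly like this (a sketch of the idea, not the article's code; it assumes rows and columns sorted in ascending order, and the diagonal step after a match assumes no duplicate values within a row or column). For 3-D you can, for instance, run it once per z-slice:

/* Step-wise linear ("saddleback") search: start at the top-right
   corner; each comparison discards either one row or one column, so
   at most rows + cols cost evaluations are made per slice. */
static void saddleback(int rows, int cols, int target,
                       int (*cost)(int row, int col),
                       void (*report)(int row, int col))
{
    int r = 0, c = cols - 1;
    while (r < rows && c >= 0) {
        int v = cost(r, c);
        if (v == target) {
            report(r, c);
            r++;
            c--;       /* step diagonally past the match */
        } else if (v > target) {
            c--;       /* rest of column c is >= v > target */
        } else {
            r++;       /* rest of row r is <= v < target */
        }
    }
}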

However, if your only goal is performance, then you should apply the binary partition to 3-D. This gets a little more complex, because the 'binary partition' he talks about essentially becomes a 'binary plane partition'.
So instead of a line partitioning your matrix into 2 possible smaller matrices, you have a plane partitioning your cube into 2 possible smaller cubes.
To make the search in that plane (or matrix) efficient, you would first have to implement one of his methods :).
Then you repeat everything with the smaller cubes.
Keep in mind that implementing this in a very efficient way (i.e. keeping memory access in mind) is not trivial.

answered Nov 20 '22 by Matt Ko

I'll give this answer in an effort to minimize the number of cost calculations. Matt Ko links to a good solution, but it assumes a cheap cost function and matrix-based data, neither of which you seem to have. The approach I give requires much closer to O(log N + k) calls to the cost function, where k is the number of points with the desired cost. Note that with some performance optimizations, this algorithm could be made O(N) on a 3D matrix with little change in the number of cost-function calls, though it's a fair bit more complicated.

The pseudocode, which is based on techniques used in quickselect, looks like this:

While there are still points under consideration:
    Find the ideal pivot point and calculate its cost
    Remove the pivot from the point set
    If the cost is the desired cost then:
        Add the pivot to the solution set
    Else:
        Separate the points into 3 groups:
             G1. Those that are in the pivot's octant `VII`
             G2. Those that have the same x, y, or z as the pivot
             G3. Those that are not in the pivot's octant `VII`
             # Note this can be done in O(N)

        If all points are in group 2:
            Use 1D binary searches in each dimension to find points with the desired cost
        Else:
            Keep all points in group 2
            If the pivot cost is greater than desired:
                Keep only the points in group 1
            Else:
                Keep only the points in group 3

The pivot is selected based on the points inside and outside of its octant `VII` (the grouping step above). Points on any of the 3 axis-aligned planes through the pivot that form the octants are dealt with later if needed (G2); the classification is sketched below.
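A hedged sketch of just that classification step (taking octant `VII` to mean all coordinates strictly smaller than the pivot's; pivot selection and the search loop are omitted):

typedef struct { int x, y, z; } point;

/* Group 2: shares an x, y, or z coordinate with the pivot (lies on
   one of the 3 axis-aligned planes through the pivot).
   Group 1: strictly inside the pivot's octant VII (all coordinates
   smaller). Group 3: everything else. O(1) per point, O(N) for the
   whole set, with no cost-function calls at all. */
static int classify(point p, point pivot)
{
    if (p.x == pivot.x || p.y == pivot.y || p.z == pivot.z)
        return 2;
    if (p.x < pivot.x && p.y < pivot.y && p.z < pivot.z)
        return 1;
    return 3;
}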

The ideal pivot point is the one for which the number of points in group 1 (G1) and group 3 (G3) are as close to equal as possible. Mathematically, that means minimizing the larger of the two relative to the smaller, i.e. minimize( max(|G1|,|G3|) / min(|G1|,|G3|) ). Even a fairly naive algorithm looking for the ideal pivot point can find it in O(N^2) (an O(N log N) algorithm likely exists), but it takes O(N^3) to compute the cost of the ideal pivot after it's found.

After the ideal pivot is found and its cost computed, each iteration should on average discard roughly half the remaining points, which again results in only O(log N + k) calls to the cost function.

Final Note:

In retrospect, I'm not sure special consideration for group 2 is actually required, as those points are probably covered by group 3, but I'm not 100% sure. However, separating them out doesn't seem to change the big-O, so I didn't see a need to change it, though doing so would simplify the algorithm slightly.

answered Nov 20 '22 by Nuclearman