Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What's the fastest way to find deepest path in a 3D array?

I've been trying to find solution to my problem for more than a week and I couldn't find out anything better than a milion iterations prog, so I think it's time to ask someone to help me.

I've got a 3D array. Let's say, we're talking about the ground and the first layer is a surface. Another layers are floors below the ground. I have to find deepest path's length, count of isolated caves underground and the size of the biggest cave.

Here's the visualisation of my problem.

Input:
5 5 5 // x, y, z
xxxxx
oxxxx
xxxxx
xoxxo
ooxxx

xxxxx
xxoxx

and so...

Output:
5 // deepest path - starting from the surface
22 // size of the biggest cave
3 // number of izolated caves (red ones) (izolated - cave that doesn't reach the surface)

Note, that even though red cell on the 2nd floor is placed next to green one, It's not the same cave because it's placed diagonally and that doesn't count. I've been told that the best way to do this, might be using recursive algorithm "divide and rule" however I don't really know how could it look like.

like image 908
Patryk Avatar asked Oct 19 '12 16:10

Patryk


4 Answers

I think you should be able to do it in O(N). When you parse your input, assign each node a 'caveNumber' initialized to 0. Set it to a valid number whenever you visit a cave:

CaveCount = 0, IsolatedCaveCount=0
AllSizes = new Vector.
For each node, 
   ProcessNode(size:0,depth:0);

ProcessNode(size,depth):
   If node.isCave and !node.caveNumber 
       if (size==0) ++CaveCount
       if (size==0 and depth!=0) IsolatedCaveCount++
       node.caveNumber = CaveCount
       AllSizes[CaveCount]++
       For each neighbor of node, 
            if (goingDeeper) depth++
            ProcessNode(size+1, depth).

You will visit each node 7 times at worst case: once from the outer loop, and possibly once from each of its six neighbors. But you'll only work on each one once, since after that the caveNumber is set, and you ignore it.

You can do the depth tracking by adding a depth parameter to the recursive ProcessNode call, and only incrementing it when visiting a lower neighbor.

like image 159
AShelly Avatar answered Nov 04 '22 14:11

AShelly


The solution shown below (as a python program) runs in time O(n lg*(n)), where lg*(n) is the nearly-constant iterated-log function often associated with union operations in disjoint-set forests.

In the first pass through all cells, the program creates a disjoint-set forest, using routines called makeset(), findset(), link(), and union(), just as explained in section 22.3 (Disjoint-set forests) of edition 1 of Cormen/Leiserson/Rivest. In later passes through the cells, it counts the number of members of each disjoint forest, checks the depth, etc. The first pass runs in time O(n lg*(n)) and later passes run in time O(n) but by simple program changes some of the passes could run in O(c) or O(b) for c caves with a total of b cells.

Note that the code shown below is not subject to the error contained in a previous answer, where the previous answer's pseudo-code contains the line
if (size==0 and depth!=0) IsolatedCaveCount++
The error in that line is that a cave with a connection to the surface might have underground rising branches, which the other answer would erroneously add to its total of isolated caves.

The code shown below produces the following output:
Deepest: 5 Largest: 22 Isolated: 3
(Note that the count of 24 shown in your diagram should be 22, from 4+9+9.)

v=[0b0000010000000000100111000,   # Cave map
   0b0000000100000110001100000,
   0b0000000000000001100111000,
   0b0000000000111001110111100,
   0b0000100000111001110111101]
nx, ny, nz = 5, 5, 5
inlay, ncells = (nx+1) * ny,  (nx+1) * ny * nz
masks = []
for r in range(ny):
    masks += [2**j for j in range(nx*ny)][nx*r:nx*r+nx] + [0]
p = [-1 for i in range(ncells)]  # parent links
r = [ 0 for i in range(ncells)]  # rank
c = [ 0 for i in range(ncells)]  # forest-size counts
d = [-1 for i in range(ncells)]  # depths

def makeset(x): # Ref: CLR 22.3, Disjoint-set forests
    p[x] = x
    r[x] = 0
def findset(x):
    if x != p[x]:
        p[x] = findset(p[x])
    return p[x]
def link(x,y):
    if r[x] > r[y]:
        p[y] = x
    else:
        p[x] = y
        if r[x] == r[y]:
            r[y] += 1
def union(x,y):
    link(findset(x), findset(y))

fa = 0                          # fa = floor above
bc = 0                          # bc = floor's base cell #
for f in v:                     # f = current-floor map
    cn = bc-1                   # cn = cell#
    ml = 0
    for m in masks:
        cn += 1
        if m & f:
            makeset(cn)
            if ml & f:
                union(cn, cn-1)
            mr = m>>nx
            if mr and mr & f:
                union(cn, cn-nx-1)
            if m & fa:
                union(cn, cn-inlay)
        ml = m
    bc += inlay
    fa = f

for i in range(inlay):
    findset(i)
    if p[i] > -1:
        d[p[i]] = 0
for i in range(ncells):
    if p[i] > -1:
        c[findset(i)] += 1
        if d[p[i]] > -1:
            d[p[i]] = max(d[p[i]], i//inlay)
isola = len([i for i in range(ncells) if c[i] > 0 and d[p[i]] < 0])
print "Deepest:", 1+max(d), "  Largest:", max(c), "  Isolated:", isola
like image 2
James Waldby - jwpat7 Avatar answered Nov 04 '22 13:11

James Waldby - jwpat7


It sounds like you're solving a "connected components" problem. If your 3D array can be converted to a bit array (e.g. 0 = bedrock, 1 = cave, or vice versa) then you can apply a technique used in image processing to find the number and dimensions of either the foreground or background.

Typically this algorithm is applied in 2D images to find "connected components" or "blobs" of the same color. If possible, find a "single pass" algorithm:

http://en.wikipedia.org/wiki/Connected-component_labeling

The same technique can be applied to 3D data. Googling "connected components 3D" will yield links like this one:

http://www.ecse.rpi.edu/Homepages/wrf/pmwiki/pmwiki.php/Research/ConnectedComponents

Once the algorithm has finished processing your 3D array, you'll have a list of labeled, connected regions, and each region will be a list of voxels (volume elements analogous to image pixels). You can then analyze each labeled region to determine volume, closeness to the surface, height, etc.

Implementing these algorithms can be a little tricky, and you might want to try a 2D implementation first. Thought it might not be as efficient as you like, you could create a 3D connected component labeling algorithm by applying a 2D algorithm iteratively to each layer and then relabeling the connected regions from the top layer to the bottom layer:

  1. For layer 0, find all connected regions using the 2D connected component algorithm
  2. For layer 1, find all connected regions.
  3. If any labeled pixel in layer 0 sits directly over a labeled pixel in layer 1, change all the labels in layer 1 to the label in layer 0.
  4. Apply this labeling technique iteratively through the stack until you reach layer N.

One important considering in connected component labeling is how one considers regions to be connected. In a 2D image (or 2D array) of bits, we can consider either the "4-connected" region of neighbor elements

X 1 X
1 C 1
X 1 X

where "C" is the center element, "1" indicates neighbors that would be considered connected, and "X" are adjacent neighbors that we do not consider connected. Another option is to consider "8-connected neighbors":

1 1 1
1 C 1
1 1 1

That is, every element adjacent to a central pixel is considered connected. At first this may sound like the better option. In real-world 2D image data a chessboard pattern of noise or diagonal string of single noise pixels will be detected as a connected region, so we typically test for 4-connectivity.

For 3D data you can consider either 6-connectivity or 26-connectivity: 6-connectivity considers only the neighbor pixels that share a full cube face with the center voxel, and 26-connectivity considers every adjacent pixel around the center voxel. You mention that "diagonally placed" doesn't count, so 6-connectivity should suffice.

like image 2
Rethunk Avatar answered Nov 04 '22 15:11

Rethunk


You can observe it as a graph where (non-diagonal) adjacent elements are connected if they both empty (part of a cave). Note that you don't have to convert it to a graph, you can use normal 3d array representation.

Finding caves is the same task as finding the connected components in a graph (O(N)) and the size of a cave is the number of nodes of that component.

like image 1
Karoly Horvath Avatar answered Nov 04 '22 13:11

Karoly Horvath