I have a large 3D numpy array (1024 x 1024 x 1024), and I need to find the regions around local maxima so that all neighbouring points with a value greater than, e.g., 50% of the local maximum, are recursively clustered in the same region. An iterative algorithm would be:
continue
looping over the next local maximum.For such a huge array this algorithm is prohibitive, so I was looking for some vectorized solution with the Python libraries.
More specifically, for a 2D array such as
0.0 1.6 0.0 0.0 0.0
0.0 2.0 1.0 0.0 5.0
1.6 3.0 1.0 0.0 4.6
0.0 0.0 0.0 9.0 4.6
there would be two such regions:
5.0
4.6
9.0 4.6
and
1.6
2.0
1.6 3.0
Intuitively, I am looking for the "mountains" around local maxima, where a "mountain" is defined by a contour level that is not absolute but relative to the local maxima.
I have tried using scipy.ndimage
, which is quite useful to first find the local maxima. But I do not know how to get the regions around them. I also had a look at standard clustering algorithms and image processing techniques such as blob detection or local thresholding, but none seemed to reproduce this problem.
Any suggestion is appreciated.
Thanks in advance,
EDIT: thanks to taw, a solution is the following
import math
import numpy as np
def soln(data, maxind):
test = np.pad(data,(1,1),'constant',constant_values = [-math.inf,-math.inf])
regionlist = {} # Dictionary with the results
for region, ind in enumerate(maxind): # Loop over all maxima
M = test[ind[0]+1, ind[1]+1, ind[2]+1] # Value of the maximum
if M == -np.inf:
continue
regionlist[region] = set()
regionlist[region].add(ind)
test[ind[0]+1, ind[1]+1, ind[2]+1] = -math.inf # All points that are added to the results are set to -infinity
neighbors = set()
neighbors.add((ind[0]+1, ind[1]+1, ind[2]+1))
while len(neighbors)>0: #create region iteratively
newneighbors = set() # This will contain the new elements in the region
for i in neighbors:
values = test[i[0]-1:i[0]+2, i[1]-1:i[1]+2, i[2]-1:i[2]+2] # Values of neighbours
valuesmask = values > .5*M # Neighbours that fall in region
list1 = range(i[0]-2, i[0]+1)
list2 = range(i[1]-2, i[1]+1)
list3 = range(i[2]-2, i[2]+1)
indlist = list(itertools.product(list1, list2, list3)) # 3-D list with coordinates of neighbours
for count,j in enumerate(valuesmask):
if j:
newneighbors.add(indlist[count])
#update iteration
newneighbors = newneighbors.difference(neighbors) # Remove all elements that were already iterated over and added to regionlist
regionlist[region].update(newneighbors) # Add the new elements in the region to regionlist
neighbors = set((x[0]-1, x[1]-1, x[2]-1) for x in newneighbors) # In the next iteration, iterate only over new elements in the region
for i in newneighbors:
test[i[0]+1, i[1]+1, i[2]+1] = -math.inf #set values to -inf after added to region
return regionlist
I'm not sure how "local maxima" is being defined, or what functions in scipy.ndimage you are using to get them. Here is a function that will give the set of indices belonging to each region (returns indices, not the values). Cost looks like O(number of points which will be assigned to regions). The constant depends on the dimension of the array. I don't think it's possible to do better than this (in terms of the complexity).
Also this solution works on 2d arrays.
import math
import numpy as np
test = np.array([[0, 1.6, 0, 0, 0,], [0, 2, 1,0,5],[1.6,3,1,0,4.6],[0,0,0,9,4.6]])
maxind = [(3,3),(2,1)] #indices of maxima
def soln(data, maxind):
test = np.pad(data,(1,1),'constant',constant_values = [-math.inf,-math.inf])
regionlist = {}
for region,ind in enumerate(maxind): #all maxima
regionlist[region] = set()
regionlist[region].add(ind)
M = test[ind[0]+1,ind[1]+1]
test[ind[0]+1,ind[1]+1] = -math.inf
neighbors = set()
neighbors.add((ind[0]+1,ind[1]+1))
while len(neighbors)>0: #create region iteratively
newneighbors = set()
for i in neighbors:
values = test[i[0]-1:i[0]+2,i[1]-1:i[1]+2]
valuesmask = values.flatten() > .5*M
list1 = np.repeat(list(range(i[0]-2,i[0]+1)),3)
list2 = np.tile(list(range(i[1]-2,i[1]+1)), 3)
indlist = list(zip(list1,list2))
for count,j in enumerate(valuesmask):
if j:
newneighbors.add(indlist[count])
#update iteration
newneighbors = newneighbors.difference(neighbors)
regionlist[region].update(newneighbors)
neighbors = newneighbors
for i in newneighbors:
test[i[0]+1,i[1]+1] = -math.inf #set values to -inf after added to region
return regionlist
regionlist = soln(test, maxind)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With