Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Algorithm to find peaks in 2D array

Let's say I have a 2D accumulator array in java int[][] array. The array could look like this:

(x and z axes represent indexes in the array, y axis represents values - these are images of an int[56][56] with values from 0 ~ 4500) array sample 1

or

array sample 1

What I need to do is find peaks in the array - there are 2 peaks in the first one and 8 peaks in the second array. These peaks are always 'obvious' (there's always a gap between peaks), but they don't have to be similar like on these images, they can be more or less random - these images are not based on the real data, just samples. The real array can have size like 5000x5000 with peaks from thousands to several hundred thousands... The algorithm has to be universal, I don't know how big the array or peaks can be, I also don't know how many peaks there are. But I do know some sort of threshold - that the peaks can't be smaller than a given value.

The problem is, that one peak can consist of several smaller peaks nearby (first image), the height can be quite random and also the size can be significantly different within one array (size - I mean the number of units it takes in the array - one peak can consist from 6 units and other from 90). It also has to be fast (all done in 1 iteration), the array can be really big.

Any help is appreciated - I don't expect code from you, just the right idea :) Thanks!


edit: You asked about the domain - but it's quite complicated and imho it can't help with the problem. It's actually an array of ArrayLists with 3D points, like ArrayList< Point3D >[][] and the value in question is the size of the ArrayList. Each peak contains points that belong to one cluster (plane, in this case) - this array is a result of an algorithm, that segments a pointcloud . I need to find the highest value in the peak so I can fit the points from the 'biggest' arraylist to a plane, compute some parameters from it and than properly cluster most of the points from the peak.
like image 811
Jaa-c Avatar asked Mar 06 '12 15:03

Jaa-c


1 Answers

He's not interested in estimating the global maximum using some sort of optimization heuristic - he just wants to find the maximum values within each of a number of separate clusters.

These peaks are always 'obvious' (there's always a gap between peaks)

Based on your images, I assume you mean there's always some 0-values separating clusters? If that's the case, you can use a simple flood-fill to identify the clusters. You can also keep track of each cluster's maximum while doing the flood-fill, so you both identify the clusters and find their maximum simultaneously.

This is also as fast as you can get, without relying on heuristics (which could return the wrong answer), since the maximum of each cluster could potentially be any value in the cluster, so you have to check them all at least once.


Note that this will iterate through every item in the array. This is also necessary, since (from the information you've given us) it's potentially possible for any single item in the array to be its own cluster (which would also make it a peak). With around 25 million items in the array, this should only take a few seconds on a modern computer.

like image 197
BlueRaja - Danny Pflughoeft Avatar answered Sep 30 '22 14:09

BlueRaja - Danny Pflughoeft