Let's say I have a 2D accumulator array in java int[][] array
. The array could look like this:
(x and z axes represent indexes in the array, y axis represents values - these are images of an int[56][56]
with values from 0 ~ 4500)
or
What I need to do is find peaks in the array - there are 2 peaks in the first one and 8 peaks in the second array. These peaks are always 'obvious' (there's always a gap between peaks), but they don't have to be similar like on these images, they can be more or less random - these images are not based on the real data, just samples. The real array can have size like 5000x5000 with peaks from thousands to several hundred thousands... The algorithm has to be universal, I don't know how big the array or peaks can be, I also don't know how many peaks there are. But I do know some sort of threshold - that the peaks can't be smaller than a given value.
The problem is, that one peak can consist of several smaller peaks nearby (first image), the height can be quite random and also the size can be significantly different within one array (size - I mean the number of units it takes in the array - one peak can consist from 6 units and other from 90). It also has to be fast (all done in 1 iteration), the array can be really big.
Any help is appreciated - I don't expect code from you, just the right idea :) Thanks!
He's not interested in estimating the global maximum using some sort of optimization heuristic - he just wants to find the maximum values within each of a number of separate clusters.
These peaks are always 'obvious' (there's always a gap between peaks)
Based on your images, I assume you mean there's always some 0
-values separating clusters? If that's the case, you can use a simple flood-fill to identify the clusters. You can also keep track of each cluster's maximum while doing the flood-fill, so you both identify the clusters and find their maximum simultaneously.
This is also as fast as you can get, without relying on heuristics (which could return the wrong answer), since the maximum of each cluster could potentially be any value in the cluster, so you have to check them all at least once.
Note that this will iterate through every item in the array. This is also necessary, since (from the information you've given us) it's potentially possible for any single item in the array to be its own cluster (which would also make it a peak). With around 25 million items in the array, this should only take a few seconds on a modern computer.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With