Conditional summation in Python




I have a 2D NumPy array (8000x7200). I want to count the number of cells whose value is greater than a specified threshold. I tried doing this with a double loop, but it takes a long time. Is there a way to perform this calculation quickly?

Assume your variables are defined as:

import numpy as np

a = np.random.rand(8000, 7200)
threshold = 0.5

Then use sum:

(a > threshold) is a boolean array that is True wherever a cell exceeds the threshold. When summed, NumPy treats False as 0 and True as 1 (just as Python's bool is a subclass of int), so the sum equals the number of matching cells. NumPy's sum method sums over the entire array by default.

(a > threshold).sum()
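
As a rough sanity check, the sketch below compares the vectorized count against the explicit double loop from the question. It uses a much smaller array (80x72) with a fixed seed so that the loop finishes quickly; those values and the names count_sum, count_nonzero, and count_loop are illustrative assumptions, not part of the original answer. It also shows np.count_nonzero, which should give the same count from the same mask.

```python
import numpy as np

# Small, fixed-seed stand-in for the 8000x7200 array (assumed for illustration)
rng = np.random.default_rng(0)
a = rng.random((80, 72))
threshold = 0.5

# Vectorized count: sum the boolean mask (True counts as 1, False as 0)
count_sum = int((a > threshold).sum())

# Alternative: count the True entries of the same mask directly
count_nonzero = int(np.count_nonzero(a > threshold))

# Reference: the explicit double loop described in the question
count_loop = 0
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        if a[i, j] > threshold:
            count_loop += 1

print(count_sum, count_nonzero, count_loop)
```

All three counts should agree. On the full-size array, the vectorized versions do the comparison and the counting inside NumPy's compiled code rather than in the Python interpreter, which is where the speedup comes from.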
