Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Conditional numpy.cumsum?

I'm very new to python and numpy, so sorry if I misuse some terminology.

I have converted a raster to a 2D numpy array in the hopes of doing calculations on it quickly and efficiently.

  • I need to get the cumulative sum across a numpy array such that, for each value, I generate the sum of all values that are less than or equal to that, and write that value to a new array. I need to cycle through the entire array that way.

  • I also need to scale the output between 1 and 100, but that seems
    more straightforward.

Attempt to clarify by example:

array([[ 4,  1  ,  3   ,  2] dtype=float32)

I would want the output values (just doing first row by hand) to read:

array([[ 10,  1  ,  6   ,  3], etc.

Any ideas on how to do that?

Thanks in advance!


The near finished script for anyone who's interested:

#Generate Cumulative Thresholds
#5/15/14

import os
import sys
import arcpy
import numpy as np

#Enable overwriting output data
arcpy.env.overwriteOutput=True

#Set working directory
os.chdir("E:/NSF Project/Salamander_Data/Continuous_Rasters/Canadian_GCM/2020/A2A/")

#Set geoprocessing variables
inRaster = "zero_eurycea_cirrigera_CA2A2020.tif"
des = arcpy.Describe(inRaster)
sr = des.SpatialReference
ext = des.Extent
ll = arcpy.Point(ext.XMin,ext.YMin)

#Convert GeoTIFF to numpy array
a = arcpy.RasterToNumPyArray(inRaster)

#Flatten for calculations
a.flatten()

#Find unique values, and record their indices to a separate object
a_unq, a_inv = np.unique(a, return_inverse=True)

#Count occurences of array indices
a_cnt = np.bincount(a_inv)

#Cumulatively sum the unique values multiplied by the number of
#occurences, arrange sums as initial array
b = np.cumsum(a_unq * a_cnt)[a_inv]

#Divide all values by 10 (reverses earlier multiplication done to
#facilitate accurate translation of ASCII scientific notation
#values < 1 to array)
b /= 10

#Rescale values between 1 and 100
maxval = np.amax(b)
b /= maxval
b *= 100

#Restore flattened array to shape of initial array
c = b.reshape(a.shape)

#Convert the array back to raster format
outRaster = arcpy.NumPyArrayToRaster(c,ll,des.meanCellWidth,des.meanCellHeight)

#Set output projection to match input
arcpy.DefineProjection_management(outRaster, sr)

#Save the raster as a TIFF
outRaster.save("C:/Users/mkcarte2/Desktop/TestData/outRaster.tif")

sys.exit()
like image 232
Vergentorix Avatar asked May 14 '14 18:05

Vergentorix


People also ask

What does Numpy Cumsum return?

cumsum. Return the cumulative sum of the elements along a given axis.

What does Cumsum mean in Python?

cumsum() function is used when we want to compute the cumulative sum of array elements over a given axis. Syntax : numpy.cumsum(arr, axis=None, dtype=None, out=None) Parameters : arr : [array_like] Array containing numbers whose cumulative sum is desired.


2 Answers

Depending on how you want to handle repeats, this could work:

In [40]: a
Out[40]: array([4, 4, 2, 1, 0, 3, 3, 1, 0, 2])

In [41]: a_unq, a_inv = np.unique(a, return_inverse=True)

In [42]: a_cnt = np.bincount(a_inv)

In [44]: np.cumsum(a_unq * a_cnt)[a_inv]
Out[44]: array([20, 20,  6,  2,  0, 12, 12,  2,  0,  6], dtype=int64)

Where of course a is your array flattened, that you would then have to reshape to the original shape.


And of course once numpy 1.9 is out, you can condense lines 41 and 42 above into the single, faster:

a_unq, a_inv, a_cnt = np.unique(a, return_inverse=True, return_counts=True)
like image 85
Jaime Avatar answered Oct 01 '22 16:10

Jaime


Edit:

This is ugly, but I think it finally works:

import numpy as np

def cond_cum_sum(my_array):
    my_list = []
    prev = -np.inf
    prev_sum = 0
    for ele in my_array:
        if prev != ele:
            prev_sum += ele
        my_list.append(prev_sum)
        prev = ele
    return np.array(my_list)

a = np.array([[4,2,2,3],
              [9,0,5,2]], dtype=np.float32)

flat_a = a.flatten()
flat_a.sort() 

temp = np.argsort(a.ravel())   

cum_sums = cond_cum_sum(flat_a)

result_1 = np.zeros(len(flat_a))
result_1[temp] = cum_sums

result = result_1.reshape(a.shape)

Result:

>>> result
array([[  9.,   2.,   2.,   5.],
       [ 23.,   0.,  14.,   2.]])
like image 41
Akavall Avatar answered Oct 01 '22 14:10

Akavall