
Numpy blockwise reduce operations

I consider myself an experienced NumPy user, but I'm not able to find a solution for the following problem. Assume there are the following arrays:

import numpy

# sorted array of times
t = numpy.cumsum(numpy.random.random(size=100))
# some values associated with the times
x = numpy.random.random(size=100)
# some indices into the time/data arrays
indices = numpy.cumsum(numpy.random.randint(low=1, high=10, size=20))
indices = indices[indices < 90]  # stay within the arrays' size of 100
if len(indices) % 2:  # make the number of indices even
    indices = indices[:-1]

# split into alternating start and end indices
istart = indices[0::2]
iend   = indices[1::2]

What I now want is to reduce the value array x over the intervals denoted by istart and iend, i.e.

# e.g. a max reduction; I'll probably also need mean and std
# (note: "is" is a Python keyword and can't be used as a loop variable)
what_i_want = numpy.array([numpy.max(x[ib:ie]) for ib, ie in zip(istart, iend)])

I have already googled a lot, but all I could find were blockwise operations via stride_tricks, which only allow regular block sizes. I was not able to find a solution that avoids a Python loop :-( In my real application the arrays are much larger and performance does matter, so I use numba.jit for the moment.

Is there any numpy function I'm missing which is able to do that?

asked Nov 21 '16 by Marti Nito


1 Answer

Have you looked at ufunc.reduceat? With np.maximum, you can do something like:

>>> np.maximum.reduceat(x, indices)

which yields the maximum values over the slices x[indices[i]:indices[i+1]]. To get what you want (x[indices[2*i]:indices[2*i+1]]), you could do

>>> np.maximum.reduceat(x, indices)[::2]

if you don't mind the extra computation of x[indices[2*i-1]:indices[2*i]]. This yields the following:

>>> numpy.array([numpy.max(x[ib:ie]) for ib,ie in zip(istart,iend)])
array([ 0.60265618,  0.97866485,  0.78869449,  0.79371198,  0.15463711,
        0.72413702,  0.97669218,  0.86605981])

>>> np.maximum.reduceat(x, indices)[::2]
array([ 0.60265618,  0.97866485,  0.78869449,  0.79371198,  0.15463711,
        0.72413702,  0.97669218,  0.86605981])
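Since the question also mentions needing mean and std, those can be built from np.add.reduceat in the same spirit. A minimal sketch, reusing the question's setup (the seeded generator is only there to make the snippet self-contained; the std formula is the population variance E[x²] − E[x]², matching numpy.std's default ddof=0):

```python
import numpy as np

# reproduce the question's setup with a seeded generator
rng = np.random.default_rng(0)
x = rng.random(100)
indices = np.cumsum(rng.integers(low=1, high=10, size=20))
indices = indices[indices < 90]
if len(indices) % 2:
    indices = indices[:-1]
istart, iend = indices[0::2], indices[1::2]

# per-interval sums and element counts
sums = np.add.reduceat(x, indices)[::2]
counts = iend - istart
means = sums / counts

# variance as E[x^2] - E[x]^2; clip tiny negative rounding error
sq_means = np.add.reduceat(x**2, indices)[::2] / counts
stds = np.sqrt(np.maximum(sq_means - means**2, 0.0))
```

The intervals are never empty here (the cumsum of positive integers is strictly increasing), so the division by counts is safe.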
answered Oct 14 '22 by wflynny