
Numpy blockwise reduce operations

I consider myself an experienced NumPy user, but I'm not able to find a solution for the following problem. Assume there are the following arrays:

import numpy

# sorted array of times
t = numpy.cumsum(numpy.random.random(size=100))
# some values associated with the times
x = numpy.random.random(size=100)
# some indices into the time/data arrays
indices = numpy.cumsum(numpy.random.randint(low=1, high=10, size=20))
indices = indices[indices < 90]  # stay within the arrays' size of 100
if len(indices) % 2:  # make the number of indices even
    indices = indices[:-1]

# split into alternating start and end indices
istart = indices[0::2]
iend   = indices[1::2]

What I now want is to reduce the value array x over the intervals denoted by istart and iend, i.e.

# e.g. a max reduction; I'll probably also need mean and std
# (note: "is" is a Python keyword and can't be used as a loop variable)
what_i_want = numpy.array([numpy.max(x[ib:ie]) for ib, ie in zip(istart, iend)])

I have already googled a lot, but all I could find were blockwise operations via stride_tricks, which only allow regular block sizes. I was not able to find a solution that avoids a Python loop :-( In my real application the arrays are much larger and performance does matter, so I use numba.jit for the moment.

Is there any numpy function I'm missing which is able to do that?

asked Nov 21 '16 by Marti Nito


1 Answer

Have you looked at ufunc.reduceat? With np.maximum, you can do something like:

>>> np.maximum.reduceat(x, indices)

which yields the maximum values over the slices x[indices[i]:indices[i+1]]. To get what you want (x[indices[2*i]:indices[2*i+1]]), you could do

>>> np.maximum.reduceat(x, indices)[::2]

if you don't mind the extra computation of x[indices[2*i-1]:indices[2*i]]. This yields the following:

>>> numpy.array([numpy.max(x[ib:ie]) for ib,ie in zip(istart,iend)])
array([ 0.60265618,  0.97866485,  0.78869449,  0.79371198,  0.15463711,
        0.72413702,  0.97669218,  0.86605981])

>>> np.maximum.reduceat(x, indices)[::2]
array([ 0.60265618,  0.97866485,  0.78869449,  0.79371198,  0.15463711,
        0.72413702,  0.97669218,  0.86605981])
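Since the question also mentions needing mean and std, those can be built from np.add.reduceat in the same spirit. A minimal sketch, reusing the question's setup (the seeded generator is only there to make the snippet self-contained; the std formula is the population variance E[x²] − E[x]², matching numpy.std's default ddof=0):

```python
import numpy as np

# reproduce the question's setup with a seeded generator
rng = np.random.default_rng(0)
x = rng.random(100)
indices = np.cumsum(rng.integers(low=1, high=10, size=20))
indices = indices[indices < 90]
if len(indices) % 2:
    indices = indices[:-1]
istart, iend = indices[0::2], indices[1::2]

# per-interval sums and element counts
sums = np.add.reduceat(x, indices)[::2]
counts = iend - istart
means = sums / counts

# variance as E[x^2] - E[x]^2; clip tiny negative rounding error
sq_means = np.add.reduceat(x**2, indices)[::2] / counts
stds = np.sqrt(np.maximum(sq_means - means**2, 0.0))
```

The intervals are never empty here (the cumsum of positive integers is strictly increasing), so the division by counts is safe.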
answered Oct 14 '22 by wflynny