Slicing numpy array with another array

I have a large one-dimensional array of integers that I need to take slices from. A single slice is trivial: a[start:end]. The problem is that I need many of these slices, and a[start:end] does not work when start and end are arrays. A for loop would do it, but this is a bottleneck and needs to be as fast as possible, so a native numpy solution would be welcome.

To further illustrate, I have this:

a = numpy.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], numpy.int16)
start = numpy.array([1, 5, 7], numpy.int16)
end   = numpy.array([2, 10, 9], numpy.int16)

And need to somehow make it into this:

[[1], [5, 6, 7, 8, 9], [7, 8]]

asked Sep 25 '12 19:09

user1698315

4 Answers

This can (almost?) be done in pure numpy using masked arrays and stride tricks. First, we create our mask:

>>> indices = numpy.arange(a.size)
>>> mask = ~((indices >= start[:,None]) & (indices < end[:,None]))

Or more simply:

>>> mask = (indices < start[:,None]) | (indices >= end[:,None])

The mask is False (i.e. the value is not masked) for those indices that are >= the start value and < the end value. (Indexing with None (aka numpy.newaxis) adds a new dimension, enabling broadcasting.) Now our mask looks like this:

>>> mask
array([[ True, False,  True,  True,  True,  True,  True,  True,  True,
         True,  True,  True],
       [ True,  True,  True,  True,  True, False, False, False, False,
        False,  True,  True],
       [ True,  True,  True,  True,  True,  True,  True, False, False,
         True,  True,  True]], dtype=bool)
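To see why the [:,None] indexing gives one mask row per slice, it can help to look at the shapes involved. The following is a minimal sketch using the arrays from the question; the names start_col, end_col and inside are only there for illustration:

```python
import numpy as np

a = np.arange(12, dtype=np.int16)
start = np.array([1, 5, 7], dtype=np.int16)
end = np.array([2, 10, 9], dtype=np.int16)

indices = np.arange(a.size)      # shape (12,)
start_col = start[:, None]       # shape (3, 1): one row per slice
end_col = end[:, None]           # shape (3, 1)

# Comparing shape (12,) against shape (3, 1) broadcasts to shape (3, 12).
inside = (indices >= start_col) & (indices < end_col)
mask = ~inside                   # True means "masked out"

print(mask.shape)                     # (3, 12)
print(inside.sum(axis=1).tolist())    # [1, 5, 2], the length of each slice
```

Note that the mask holds len(start) * a.size booleans, so its memory use grows with both the number of slices and the length of a.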

Now we have to stretch the array to fit the mask using stride_tricks:

>>> as_strided = numpy.lib.stride_tricks.as_strided
>>> strided = as_strided(a, mask.shape, (0, a.strides[0]))
>>> strided
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
       [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
       [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11]], dtype=int16)
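As an aside, NumPy 1.10 and later also provide numpy.broadcast_to, which produces the same kind of zero-stride view without computing the strides by hand, and marks it read-only. This is a substitute for the as_strided call above rather than what this answer originally used; a minimal sketch:

```python
import numpy as np

a = np.arange(12, dtype=np.int16)

# Read-only view with 3 rows that all alias the memory of `a`.
stretched = np.broadcast_to(a, (3, a.size))

print(stretched.shape)                   # (3, 12)
print(stretched.strides)                 # (0, 2): row stride 0, 2 bytes per int16
print(np.shares_memory(stretched, a))    # True
```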

This looks like a 3x12 array, but each row points at the same memory. Now we can combine them into a masked array:

>>> numpy.ma.array(strided, mask=mask)
masked_array(data =
 [[-- 1 -- -- -- -- -- -- -- -- -- --]
 [-- -- -- -- -- 5 6 7 8 9 -- --]
 [-- -- -- -- -- -- -- 7 8 -- -- --]],
             mask =
 [[ True False  True  True  True  True  True  True  True  True  True  True]
 [ True  True  True  True  True False False False False False  True  True]
 [ True  True  True  True  True  True  True False False  True  True  True]],
       fill_value = 999999)

This isn't quite the same as what you asked for, but it should behave similarly.
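If the ragged list from the question is needed after all, it can probably be recovered from the masked array row by row, and per-slice reductions can be run on the masked array directly. A sketch under the same setup as above (the names masked and slices are just for this example):

```python
import numpy as np

a = np.arange(12, dtype=np.int16)
start = np.array([1, 5, 7])
end = np.array([2, 10, 9])

indices = np.arange(a.size)
mask = (indices < start[:, None]) | (indices >= end[:, None])
strided = np.lib.stride_tricks.as_strided(a, mask.shape, (0, a.strides[0]))
masked = np.ma.array(strided, mask=mask)

# Reductions along axis 1 skip the masked entries.
print(masked.sum(axis=1).tolist())    # [1, 35, 15]
print(masked.max(axis=1).tolist())    # [1, 9, 8]

# compressed() drops the masked entries, giving back the ragged slices.
slices = [row.compressed() for row in masked]
print([s.tolist() for s in slices])   # [[1], [5, 6, 7, 8, 9], [7, 8]]
```

The row-by-row compressed() step is itself a Python loop, so the masked array pays off mainly when the work can stay in reductions like sum or max.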

answered Oct 05 '22 19:10

senderle

There is no numpy method to do this. Since the result is irregular, it could only be a list of arrays/slices anyway. However, all binary ufuncs (and almost all functions in numpy are ufuncs, or are at least based on them) have a reduceat method, which might let you avoid creating a list of slices at all and thus, if the slices are small, speed up the calculation too:

In [1]: a = np.arange(10)

In [2]: np.add.reduceat(a, [0,4,7]) # add up 0:4, 4:7 and 7:end
Out[2]: array([ 6, 15, 24])

In [3]: np.maximum.reduceat(a, [0,4,7]) # maximum of each of those slices
Out[3]: array([3, 6, 9])

In [4]: w = np.asarray([0,4,7,10]) # 10 for the total length

In [5]: np.add.reduceat(a, w[:-1]).astype(float)/np.diff(w) # equivalent to mean
Out[5]: array([ 1.5,  5. ,  8. ])

EDIT: Since your slices overlap, I will add that this is OK too:

# I assume that start is sorted for performance reasons.
reductions = np.column_stack((start, end)).ravel()
sums = np.add.reduceat(a, reductions)[::2]

The [::2] should be no big deal here normally, since no real extra work is done for overlapping slices.
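To make the interleaving concrete, here is a small worked sketch with the arrays from the question. The name full is only used here to show the intermediate result:

```python
import numpy as np

a = np.arange(12, dtype=np.int16)
start = np.array([1, 5, 7])
end = np.array([2, 10, 9])

# Interleave to [s0, e0, s1, e1, s2, e2].
reductions = np.column_stack((start, end)).ravel()
print(reductions.tolist())    # [1, 2, 5, 10, 7, 9]

full = np.add.reduceat(a, reductions)
# Even positions reduce a[s_i:e_i]. Odd positions cover the gap up to the
# next start, or a single element when the next index is not larger.
sums = full[::2]
print(sums.tolist())          # [1, 35, 15]
```

One caveat: for an empty slice (start == end), reduceat returns a[start] rather than 0, so empty slices need separate handling.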

There is also one problem here: slices for which end == len(a). This must be avoided, because every index passed to reduceat has to be smaller than len(a). If exactly one slice ends there and it is the last one, you could just do reductions = reductions[:-1]; otherwise you will need to append a value to a to trick reduceat:

a = np.concatenate((a, [0]))

Adding one value to the end does not matter, since you only work on the slices anyway.
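Putting the padding trick together with the interleaving, a sketch might look like the following. slice_sums is a hypothetical helper name, not a numpy function, and it assumes every slice is non-empty (start < end):

```python
import numpy as np

def slice_sums(a, start, end):
    """Sum a[s:e] for each (s, e) pair; hypothetical helper for illustration."""
    a = np.asarray(a)
    # Pad with one zero so that end == len(a) is a valid reduceat index.
    padded = np.concatenate((a, np.zeros(1, dtype=a.dtype)))
    reductions = np.column_stack((start, end)).ravel()
    return np.add.reduceat(padded, reductions)[::2]

a = np.arange(12, dtype=np.int16)
# The second slice runs to the very end of the array (end == len(a)).
print(slice_sums(a, [1, 5, 7], [2, 12, 9]).tolist())    # [1, 56, 15]
```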

answered Oct 05 '22 18:10

seberg

It's not a "pure" numpy solution (although as @mgilson's comment notes, it's hard to see how the irregular output could be a numpy array), but:

a = numpy.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], numpy.int16)
start = numpy.array([1, 5, 7], numpy.int16)
end   = numpy.array([2, 10, 9], numpy.int16)

list(map(lambda r: a[r[0]:r[1]], zip(start, end)))

gets you:

[array([1], dtype=int16), array([5, 6, 7, 8, 9], dtype=int16), array([7, 8], dtype=int16)]

as required.

answered Oct 05 '22 17:10

timday

If you want it in one line, it would be:

x = [list(a[s:e]) for s, e in zip(start, end)]
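One thing worth knowing when choosing between arrays and lists here: basic slicing such as a[s:e] returns a view, so a list of array slices does not copy any data, whereas list(...) copies every element into Python objects. A minimal sketch (the names views and lists are just for this example):

```python
import numpy as np

a = np.arange(12, dtype=np.int16)
start = np.array([1, 5, 7])
end = np.array([2, 10, 9])

views = [a[s:e] for s, e in zip(start, end)]          # no data copied
lists = [list(a[s:e]) for s, e in zip(start, end)]    # copies into Python lists

print(all(np.shares_memory(v, a) for v in views))     # True
print([v.tolist() for v in views])    # [[1], [5, 6, 7, 8, 9], [7, 8]]
```

If the slices are only read afterwards, keeping them as views is likely the cheaper option for long slices.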

answered Oct 05 '22 17:10

Rosa Alejandra

Tags: python, arrays, slice, numpy