Sampling a fixed length sequence from a numpy array

I have a data array a and a list of indices stored in the array idx. I would like to get a length-10 chunk of data starting at each of the indices in idx. Right now I use a for loop to do this, but it is extremely slow because I have to do this data fetch about 1000 times per iteration. Below is a minimal working example.

import numpy as np
a = np.random.random(1000)
idx = np.array([1, 5, 89, 54])

# I want "data" array to have np.array([a[1:11], a[5:15], a[89:99], a[54:64]])
# I use for loop below but it is slow
data = []

for start in idx:  # "start" avoids shadowing the built-in id()
    data.append(a[start:start+10])
data = np.array(data)

Is there any way to speed up this process? Thanks.

EDIT: My question is different from the question asked here. In that question, the chunk sizes are random, whereas in mine the chunk size is fixed. There are other differences too: I do not have to use up the entire array a, and an element can occur in more than one chunk. My question does not necessarily "split" the array.

asked Dec 12 '20 08:12

learner

1 Answer

(Thanks to suggestion from @MadPhysicist)

This should work:

a[idx.reshape(-1, 1) + np.arange(10)]

Output: Shape (L,10), where L is the length of idx
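
To see why the one-liner works, here is a minimal sketch, assuming a is 1-D and every chunk fits inside a. The names window, offsets and index_matrix are illustrative and not part of the original answer. idx.reshape(-1, 1) has shape (L, 1) and np.arange(10) has shape (10,), so broadcasting their sum gives an (L, 10) matrix of indices whose row k runs from idx[k] to idx[k] + 9.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the example repeatable
a = rng.random(1000)
idx = np.array([1, 5, 89, 54])
window = 10  # illustrative name for the fixed chunk length

# Column of start positions (L, 1) + row of offsets (window,) -> (L, window)
offsets = np.arange(window)
index_matrix = idx.reshape(-1, 1) + offsets

# Indexing with a 2-D integer array returns an array of the same shape
data_vectorized = a[index_matrix]

# Reference result built the same way as the loop in the question
data_loop = np.array([a[i:i + window] for i in idx])

print(data_vectorized.shape)
print(np.array_equal(data_vectorized, data_loop))
```

Because the result comes from integer-array indexing, it is a copy of the data rather than a view, so modifying it does not change a.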

Notes:

This does not guard against out-of-bounds indices: if idx + 9 exceeds the last index of a for any entry, the indexing raises an IndexError. It should be easy to ensure beforehand that idx contains no such values.
Alternatively, np.take(a, idx.reshape(-1, 1) + np.arange(10), mode='wrap') handles out-of-bounds indices by wrapping them around a. Passing mode='clip' instead of mode='wrap' clips out-of-range indices to the last index of a. Note that np.take() may have different performance and scaling characteristics from plain indexing, so it is worth timing both.
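
The out-of-bounds behaviour described in these notes can be sketched on a deliberately tiny array. This is only an illustration, assuming non-negative start indices. The names valid, safe, wrapped and clipped are made up for the example, and filtering idx beforehand is just one possible way to keep the chunks in range.

```python
import numpy as np

a = np.arange(12.0)          # small array so the overflow is easy to see
idx = np.array([0, 5])       # the chunk starting at 5 needs indices 5..14
window = 10
index_matrix = idx.reshape(-1, 1) + np.arange(window)

# Plain indexing raises IndexError for indices >= len(a)
try:
    a[index_matrix]
    raised = False
except IndexError:
    raised = True

# Option 1: keep only the starts whose whole chunk fits inside a
valid = idx[idx + window <= len(a)]
safe = a[valid.reshape(-1, 1) + np.arange(window)]

# Option 2: wrap overflowing indices around to the start of a
wrapped = np.take(a, index_matrix, mode='wrap')

# Option 3: clamp overflowing indices to the last valid index
clipped = np.take(a, index_matrix, mode='clip')

print(raised)
print(wrapped[1])
print(clipped[1])
```

Which option is appropriate depends on the data: wrapping suits periodic signals, whereas clipping repeats the final element and filtering drops the incomplete chunks.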

answered Nov 14 '22 22:11

fountainhead

Tags: python, arrays, list, python-3.x, numpy
