Efficiently reshape numpy array

I am working with NumPy arrays.

I have a vector D of length 2N and want to rearrange part of it into an N x N array C, where C[a,b] = D[N + a - b].

Right now this code does what I want, but is a bottleneck for larger N:

```

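import numpy as np

M = 1000
t = np.arange(M)
D = np.sin(t)    # initial vector is a sin() function
N = M // 2       # integer division: N must be an int to serve as a shape and an index
C = np.zeros((N, N))
for a in range(N):
    for b in range(N):
        C[a, b] = D[N + a - b]   # the index N + a - b runs from 1 to 2N - 1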

```

Once C is made I go ahead and do some matrix arithmetic on it, etc.

This nested loop is pretty slow. Since the operation is essentially a change of indexing, I figured I could use NumPy's built-in reshape (numpy.reshape) to speed this part up.

Unfortunately, I cannot seem to figure out a good way of transforming these indices.

Any help speeding this part up?

asked Jun 14 '16 19:06

jjgoings

1 Answer

You can use NumPy broadcasting to remove those nested loops -

```
C = D[N + np.arange(N)[:,None] - np.arange(N)]
```
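To see what the broadcast expression builds, here is a minimal sketch on a deliberately tiny case. The size N = 3, the stand-in data, and the names `a`, `b` and `idx` are assumptions made for illustration and do not come from the question.

```python
import numpy as np

N = 3                         # small illustrative size (assumption)
D = np.arange(2 * N) * 10     # stand-in data: D[k] = 10*k, so each value reveals its index

a = np.arange(N)[:, None]     # shape (N, 1): row index a
b = np.arange(N)              # shape (N,):   column index b
idx = N + a - b               # broadcasts to shape (N, N) with idx[a, b] = N + a - b

C = D[idx]                    # integer-array indexing gathers D at every entry of idx
print(idx)
print(C)
```

The column vector and the row vector are stretched against each other, so the whole index grid is computed in one array operation and the gather `D[idx]` replaces both loops.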

One can also use np.take to replace the indexing, like so -

```
C = np.take(D,N + np.arange(N)[:,None] - np.arange(N))
```

A closer look reveals a Toeplitz pattern: C[a,b] depends only on a - b, so every diagonal of C is constant. Reversing the column order turns it into a Hankel matrix. SciPy's toeplitz and hankel therefore give two more ways to build C, with speedups comparable to broadcasting. The implementations would look something like these -

```
from scipy.linalg import toeplitz
from scipy.linalg import hankel

C = toeplitz(D[N:],np.hstack((D[0],D[N-1:0:-1])))
C = hankel(D[1:N+1],D[N:])[:,::-1]
```
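As a sanity check, the sketch below compares all four vectorized expressions against the original double loop on a small input. The size N = 4 and the variable names (`C_loop`, `C_bcast`, and so on) are assumptions chosen for illustration, and the loop is written with Python 3's `range`.

```python
import numpy as np
from scipy.linalg import toeplitz, hankel

N = 4                                  # small illustrative size (assumption)
D = np.sin(np.arange(2 * N))           # same kind of data as in the question

# Reference result from the original double loop
C_loop = np.zeros((N, N))
for a in range(N):
    for b in range(N):
        C_loop[a, b] = D[N + a - b]

C_bcast = D[N + np.arange(N)[:, None] - np.arange(N)]
C_take = np.take(D, N + np.arange(N)[:, None] - np.arange(N))
# First column is D[N:], first row is D[N], D[N-1], ..., D[1]
# (toeplitz ignores the first element of the row argument)
C_toep = toeplitz(D[N:], np.hstack((D[0], D[N-1:0:-1])))
# hankel yields H[i, j] = D[i + j + 1]; reversing the columns maps it to D[N + a - b]
C_hank = hankel(D[1:N+1], D[N:])[:, ::-1]

all_equal = all(np.array_equal(C_loop, X) for X in (C_bcast, C_take, C_toep, C_hank))
print(all_equal)
```

Every variant only copies elements of D without doing arithmetic on them, which is why an exact comparison with `np.array_equal` is appropriate here.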

Runtime test (recorded under Python 2, hence xrange and N = M / 2) -

```
In [230]: M = 1000
     ...: t = np.arange(M)
     ...: D = np.sin(t)    # initial vector is a sin() function
     ...: N = M / 2

In [231]: def org_app(D,N):
     ...:     C = np.zeros((N,N))
     ...:     for a in xrange(N):
     ...:         for b in xrange(N):
     ...:             C[a,b] = D[N + a - b]
     ...:     return C

In [232]: %timeit org_app(D,N)
     ...: %timeit D[N + np.arange(N)[:,None] - np.arange(N)]
     ...: %timeit np.take(D,N + np.arange(N)[:,None] - np.arange(N))
     ...: %timeit toeplitz(D[N:],np.hstack((D[0],D[N-1:0:-1])))
     ...: %timeit hankel(D[1:N+1],D[N:])[:,::-1]
10 loops, best of 3: 83 ms per loop
100 loops, best of 3: 2.82 ms per loop
100 loops, best of 3: 2.84 ms per loop
100 loops, best of 3: 2.95 ms per loop
100 loops, best of 3: 2.93 ms per loop
```

The four vectorized versions all take about 2.8 to 3.0 ms against 83 ms for the nested loops, a speedup of roughly 28x to 29x.
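The session above uses IPython's %timeit magic. A rough way to repeat the comparison in a plain Python 3 script with the standard `timeit` module might look like the sketch below. The function names `loop_version` and `broadcast_version` and the repeat counts are assumptions, and the absolute timings will depend on the machine and library versions.

```python
import timeit
import numpy as np

M = 1000
D = np.sin(np.arange(M))
N = M // 2                    # integer division so N can be used as a shape and an index

def loop_version(D, N):
    # Original nested-loop construction, written for Python 3
    C = np.zeros((N, N))
    for a in range(N):
        for b in range(N):
            C[a, b] = D[N + a - b]
    return C

def broadcast_version(D, N):
    # Broadcasting-based index grid followed by a single gather
    return D[N + np.arange(N)[:, None] - np.arange(N)]

# Best of three repeats, reported as seconds per call
t_loop = min(timeit.repeat(lambda: loop_version(D, N), number=1, repeat=3))
t_bcast = min(timeit.repeat(lambda: broadcast_version(D, N), number=10, repeat=3)) / 10
print(f"loop: {t_loop * 1e3:.2f} ms, broadcasting: {t_bcast * 1e3:.2f} ms")
```

Taking the minimum over repeats mirrors the "best of 3" figure that %timeit reports in the session.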

answered Oct 11 '22 15:10

Divakar

Tags: performance, python, arrays, vectorization, numpy
