What is the fastest way to extract given rows and columns from a NumPy ndarray?

Tags:

performance

python

optimization

numpy

scipy

I have a large (approx. 14,000 x 14,000) square matrix represented as a NumPy ndarray. I wish to extract a large number of rows and columns to get a new square matrix (approx. 10,000 x 10,000). I know the indices in advance: they are all the rows and columns that are not all-zero.

The fastest way I have found to do this is:

> timeit A[np.ix_(indices, indices)]
1 loops, best of 3: 6.19 s per loop

However, this is much slower than element-wise multiplication of the matrix with itself (note that np.multiply multiplies element by element; it is not a matrix product):

> timeit np.multiply(A, A)
1 loops, best of 3: 982 ms per loop

This seems strange, since both the row/column extraction and the multiplication need to allocate a new array (which will be even larger for the result of the multiplication than for the extraction), but the multiplication also needs to perform additional computation.

Thus, the question: is there a more efficient way to perform the extraction, in particular one that is at least as fast as the multiplication?
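
For illustration, the selection described above can be sketched on a tiny matrix. This is only a sketch under assumptions: the 3 x 3 example values are made up, and it assumes an index is kept when either its row or its column contains a non-zero entry, which is one possible reading of "not all-zero".

```python
import numpy as np

# Hypothetical small matrix: row 1 and column 1 are entirely zero.
A = np.array([
    [1.0, 0.0, 2.0],
    [0.0, 0.0, 0.0],
    [3.0, 0.0, 4.0],
])

# Assumption: keep index i if row i or column i has any non-zero entry.
keep = A.any(axis=1) | A.any(axis=0)
indices = np.flatnonzero(keep)

# np.ix_ builds an open mesh so that the same indices select rows and columns.
sub = A[np.ix_(indices, indices)]

print(indices)  # [0 2]
print(sub)      # the 2 x 2 matrix [[1, 2], [3, 4]]
```

The result is a new array (a copy), not a view into A, which is why the extraction has to allocate memory for the smaller matrix.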

asked Aug 29 '14 19:08

jveldridge

1 Answer

If I try to reproduce your problem, I don't see such a drastic effect. I notice that depending on how many indices you choose, the indexing can even be faster than the multiplication.

>>> import numpy as np
>>> np.__version__
Out[1]: '1.9.0'
>>> N = 14000
>>> A = np.random.random(size=[N, N])

>>> indices = np.sort(np.random.choice(np.arange(N), int(0.9*N), replace=False))
>>> timeit A[np.ix_(indices, indices)]
1 loops, best of 3: 1.02 s per loop
>>> timeit A.take(indices, axis=0).take(indices, axis=1)
1 loops, best of 3: 1.37 s per loop
>>> timeit np.multiply(A,A)
1 loops, best of 3: 748 ms per loop

>>> indices = np.sort(np.random.choice(np.arange(N), int(0.7*N), replace=False))
>>> timeit A[np.ix_(indices, indices)]
1 loops, best of 3: 633 ms per loop
>>> timeit A.take(indices, axis=0).take(indices, axis=1)
1 loops, best of 3: 946 ms per loop
>>> timeit np.multiply(A,A)
1 loops, best of 3: 728 ms per loop
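
The comparison above can also be rerun as a standalone script. This is a minimal sketch, assuming a current NumPy; the smaller N, the fixed seed, and the helper name best_time are my own choices for illustration, so the absolute timings will not match the session above.

```python
import timeit

import numpy as np

N = 2000  # assumption: much smaller than the 14000 above so the script runs quickly
rng = np.random.default_rng(0)  # assumption: fixed seed for repeatability
A = rng.random((N, N))

# Sorted, unique indices covering 70% of the rows/columns.
indices = np.sort(rng.choice(N, size=int(0.7 * N), replace=False))


def best_time(func, repeat=3):
    """Return the best wall-clock time in seconds over `repeat` single runs."""
    return min(timeit.repeat(func, number=1, repeat=repeat))


via_ix = A[np.ix_(indices, indices)]
via_take = A.take(indices, axis=0).take(indices, axis=1)

# Both approaches select the same square submatrix.
assert via_ix.shape == (len(indices), len(indices))
assert np.array_equal(via_ix, via_take)

print("np.ix_     :", best_time(lambda: A[np.ix_(indices, indices)]))
print("take/take  :", best_time(lambda: A.take(indices, axis=0).take(indices, axis=1)))
print("np.multiply:", best_time(lambda: np.multiply(A, A)))
```

Raising N back to 14000 and varying the fraction of selected indices should show whether the indexing or the element-wise multiplication is faster on a given machine.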

answered Oct 16 '22 13:10

physicalattraction
