In numpy, what's the most efficient way to compute x.T * x, where x is a large (200,000 x 1000) dense float32 matrix and .T is the transpose operator?
For the avoidance of doubt, the result is 1000 x 1000.
edit: In my original question I stated that np.dot(x.T, x) was taking hours. It turned out that I had some NaNs sneak into the matrix, and for some reason that was completely killing the performance of np.dot (any insights as to why?). This is now resolved, but the original question stands.
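A minimal sketch of the computation in the question, with a NaN check added before the product because the edit above reports that NaNs were behind the slowdown. The matrix here is random and much smaller than 200,000 x 1000 so that it runs quickly; the names rng, has_nan and gram are illustrative only.

```python
import numpy as np

# Stand-in data: a much smaller matrix than the one in the question.
rng = np.random.default_rng(0)
x = rng.standard_normal((2000, 100)).astype(np.float32)

# Scan for NaNs before starting the expensive product.
has_nan = bool(np.isnan(x).any())
print("contains NaN:", has_nan)

# x.T is a view, not a copy, so no extra memory is used for the transpose.
gram = np.dot(x.T, x)
print(gram.shape, gram.dtype)
```

The result has one row and one column per column of x, so for the matrix in the question it would be 1000 x 1000.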
For example, if matrix 1 has dimensions a x N and matrix 2 has dimensions N x b, then the resulting matrix has dimensions a x b. To multiply two matrices, use NumPy's dot() function. It takes the two arrays as its arguments (plus an optional out array) and returns their product.
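As a small illustration of the shape rule, here is a sketch with a 2 x 3 matrix and a 3 x 4 matrix. The values are arbitrary integers chosen so the result is easy to check by hand.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)    # [[0, 1, 2], [3, 4, 5]]
b = np.arange(12).reshape(3, 4)   # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]

# The inner dimensions (3 and 3) must match; the result is 2 x 4.
product = np.dot(a, b)
print(product.shape)   # (2, 4)
print(product[0, 0])   # 0*0 + 1*4 + 2*8 = 20
```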
np.matmul and @ are the same thing for NumPy arrays: both perform matrix multiplication. The @ operator was added in Python 3.5 to give matrix multiplication its own infix operator.
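A quick check that the two spellings agree on plain 2-D arrays (the values are arbitrary):

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[5.0, 6.0], [7.0, 8.0]])

via_function = np.matmul(a, b)
via_operator = a @ b   # requires Python 3.5 or later

print(np.array_equal(via_function, via_operator))   # True
```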
You can get the number of dimensions, the shape (length of each dimension), and the size (total number of elements) of a NumPy array with the ndim, shape, and size attributes of numpy.ndarray. The built-in function len() returns the size of the first dimension.
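For instance, on a small float32 array (the 200 x 10 shape is just an example):

```python
import numpy as np

x = np.zeros((200, 10), dtype=np.float32)

print(x.ndim)    # 2
print(x.shape)   # (200, 10)
print(x.size)    # 2000
print(len(x))    # 200, the length of the first dimension
```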
This may not be the answer you're looking for, but one way to speed it up considerably is to use a GPU instead of your CPU. If you have a decently powerful graphics card around, it will usually outperform your CPU on a large dense product like this, even if your system is very well tuned.
For nice integration with numpy, you could use theano (if your graphics card is made by nvidia). The calculation in the following code runs for me in a couple of seconds (although I have a very powerful graphics card):
$ THEANO_FLAGS=device=gpu0 python
Python 2.6.5 (r265:79063, Apr 16 2010, 13:57:41)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import theano
Using gpu device 0: GeForce GTX 480
>>> from theano import tensor as T
>>> import numpy
>>> x = numpy.ones((200000, 1000), dtype=numpy.float32)
>>> m = T.matrix()
>>> mTm = T.dot(m.T, m)
>>> f = theano.function([m], mTm)
>>> f(x)
array([[ 200000., 200000., 200000., ..., 200000., 200000., 200000.],
[ 200000., 200000., 200000., ..., 200000., 200000., 200000.],
[ 200000., 200000., 200000., ..., 200000., 200000., 200000.],
...,
[ 200000., 200000., 200000., ..., 200000., 200000., 200000.],
[ 200000., 200000., 200000., ..., 200000., 200000., 200000.],
[ 200000., 200000., 200000., ..., 200000., 200000., 200000.]], dtype=float32)
>>> r = f(x)
>>> r.shape
(1000, 1000)
I was going to wait to find out how long numpy.dot(x.T, x) took by way of comparison, but I got bored...
You can also try PyCuda or PyOpenCL (if you don't have an nvidia graphics card), although I don't know if their numpy support is as straightforward.