numpy: compute x.T*x for a large matrix

In numpy, what's the most efficient way to compute x.T * x, where x is a large (200,000 x 1000) dense float32 matrix and .T is the transpose operator?

For the avoidance of doubt, the result is 1000 x 1000.
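
For concreteness, this is the plain NumPy version of the operation being asked about (a minimal sketch; the shape and dtype are taken from the question, and the @ operator assumes Python 3.5+ with NumPy 1.10+):

import numpy as np

# Dense 200,000 x 1000 float32 matrix, as described in the question
x = np.ones((200000, 1000), dtype=np.float32)

# x.T * x in the matrix sense; both spellings hand the work to the underlying BLAS gemm
g = np.dot(x.T, x)   # classic spelling
g = x.T @ x          # equivalent on Python 3.5+ / NumPy 1.10+

print(g.shape)       # (1000, 1000)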

edit: In my original question I stated that np.dot(x.T, x) was taking hours. It turned out that I had some NaNs sneak into the matrix, and for some reason that was completely killing the performance of np.dot (any insights as to why?). This is now resolved, but the original question stands.
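
(Aside on that edit: a cheap finiteness check before the multiply will catch stray NaNs. This is only a sketch; replacing NaNs with 0.0 via np.nan_to_num is an assumption about what the data can tolerate, not something from the original post.)

import numpy as np

x = np.ones((200000, 1000), dtype=np.float32)
x[0, 0] = np.nan                  # simulate a stray NaN sneaking in

if not np.isfinite(x).all():      # cheap check for NaN/Inf before the multiply
    x = np.nan_to_num(x)          # nan -> 0.0, +/-inf -> large finite values

result = np.dot(x.T, x)
print(result.shape)               # (1000, 1000)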

asked Dec 07 '10 by NPE

1 Answer

This may not be the answer you're looking for, but one way to speed it up considerably is to use a GPU instead of your CPU. If you have a decently powerful graphics card around, it'll outperform your CPU any day, even if your system is very well tuned.

For nice integration with numpy, you could use theano (if your graphics card is made by nvidia). The calculation in the following code runs for me in a couple of seconds (although I have a very powerful graphics card):

$ THEANO_FLAGS=device=gpu0 python
Python 2.6.5 (r265:79063, Apr 16 2010, 13:57:41) 
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import theano
Using gpu device 0: GeForce GTX 480
>>> from theano import tensor as T
>>> import numpy
>>> x = numpy.ones((200000, 1000), dtype=numpy.float32)
>>> m = T.matrix() 
>>> mTm = T.dot(m.T, m)
>>> f = theano.function([m], mTm)
>>> f(x)
array([[ 200000.,  200000.,  200000., ...,  200000.,  200000.,  200000.],
       [ 200000.,  200000.,  200000., ...,  200000.,  200000.,  200000.],
       [ 200000.,  200000.,  200000., ...,  200000.,  200000.,  200000.],
       ..., 
       [ 200000.,  200000.,  200000., ...,  200000.,  200000.,  200000.],
       [ 200000.,  200000.,  200000., ...,  200000.,  200000.,  200000.],
       [ 200000.,  200000.,  200000., ...,  200000.,  200000.,  200000.]], dtype=float32)
>>> r = f(x)
>>> r.shape
(1000, 1000)

I was going to wait to find out how long >>> numpy.dot(x.T, x) took by way of comparison, but I got bored...
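
If you want that CPU-side number for comparison yourself, a minimal timing sketch (sizes match the question; the actual figure will depend heavily on your BLAS build):

import time
import numpy as np

x = np.ones((200000, 1000), dtype=np.float32)

start = time.time()
r = np.dot(x.T, x)
print("np.dot(x.T, x) took %.2f s, shape %s" % (time.time() - start, r.shape))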

You can also try PyCuda or PyOpenCL (if you don't have an nvidia graphics card), although I don't know if their numpy support is as straightforward.

answered by Josh Bleecher Snyder