I have an array A that is of the shape (M,N) now I'd like to do the operation
R = (A[:,newaxis,:] * A[newaxis,:,:]).sum(2)
which should yield an (MxM) array. Now the problem is that the array is quite large and I get a Memory error, because the MxMxN array won't fit into memory.
what would be the best strategy to get this done? C? map()? or is there a special function for this yet?
thank you, David
I'm not sure how large you arrays are but the following is equivalent:
R = np.einsum('ij,kj',A,A)
And can be quite a bit faster and is much less memory intensive:
In [7]: A = np.random.random(size=(500,400))
In [8]: %timeit R = (A[:,np.newaxis,:] * A[np.newaxis,:,:]).sum(2)
1 loops, best of 3: 1.21 s per loop
In [9]: %timeit R = np.einsum('ij,kj',A,A)
10 loops, best of 3: 54 ms per loop
If I increase the size of A
to (500,4000)
, np.einsum
plows through the calculation in about 2 seconds, whereas the original formulation grinds my machine to a halt due to the size of the temporary array it has to create.
Update:
As @Jaime pointed out in the comments, np.dot(A,A.T)
is also an equivalent formulation of the problem, and can be even faster than the np.einsum
solution. Full credit to him for pointing that out, but in case he doesn't post it as a formal solution, I wanted to pull it out into the main answer.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With