Multipying and adding inside of a too large array

Question

I have an array A that is of the shape (M,N) now I'd like to do the operation

R = (A[:,newaxis,:] * A[newaxis,:,:]).sum(2)

which should yield an (MxM) array. Now the problem is that the array is quite large and I get a Memory error, because the MxMxN array won't fit into memory.

what would be the best strategy to get this done? C? map()? or is there a special function for this yet?

thank you, David

JoshAdel · Accepted Answer

I'm not sure how large you arrays are but the following is equivalent:

R = np.einsum('ij,kj',A,A)

And can be quite a bit faster and is much less memory intensive:

In [7]: A = np.random.random(size=(500,400))

In [8]: %timeit R = (A[:,np.newaxis,:] * A[np.newaxis,:,:]).sum(2)
1 loops, best of 3: 1.21 s per loop

In [9]: %timeit R = np.einsum('ij,kj',A,A)
10 loops, best of 3: 54 ms per loop

If I increase the size of A to (500,4000), np.einsum plows through the calculation in about 2 seconds, whereas the original formulation grinds my machine to a halt due to the size of the temporary array it has to create.

Update:

As @Jaime pointed out in the comments, np.dot(A,A.T) is also an equivalent formulation of the problem, and can be even faster than the np.einsum solution. Full credit to him for pointing that out, but in case he doesn't post it as a formal solution, I wanted to pull it out into the main answer.

Multipying and adding inside of a too large array

Tags:

python

numpy

Magellan88

1 Answers

JoshAdel

Recent Activity

Donate For Us

Multipying and adding inside of a too large array

Tags:

python

numpy

Magellan88

1 Answers

JoshAdel

Related questions

Recent Activity

Donate For Us