I am having trouble figuring out how to arrange my axes so I can perform the following operations in a vectorized way.
Essentially I have an array of vectors, an array of matrices, and I want to evaluate VMV^T for each corresponding vector V and matrix M
import numpy as np
N = 5 # normally 100k or so
vecs = np.random.rand(N, 2)
mats = np.random.rand(N, 2, 2)
output = np.array([np.dot(np.dot(vecs[i, ...], mats[i, ...]), vecs[i, ...].T) for i in range(N)])
If it's simpler, vectorizing the intermediate result below would also be helpful:
intermediate_result = np.array([np.dot(vecs[i, ...], mats[i, ...]) for i in range(N)])
# then I can do
output = np.sum(intermediate_result * vecs, axis=-1)
An einsum
-based solution is 100x faster than your loop for N = 100k:
%timeit np.array([np.dot(np.dot(vecs[i, ...], mats[i, ...]), vecs[i, ...].T) for i in range(N)])
%timeit np.einsum('...i,...ij,...j->...', vecs, mats, vecs)
np.allclose(np.array([np.dot(np.dot(vecs[i, ...], mats[i, ...]), vecs[i, ...].T) for i in range(N)]),
np.einsum('...i,...ij,...j->...', vecs, mats, vecs))
1 loops, best of 3: 640 ms per loop
100 loops, best of 3: 7.02 ms per loop
Out[45]: True
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With