I have a k*n
matrix X, and an k*k
matrix A. For each column of X
, I'd like to calculate the scalar
X[:, i].T.dot(A).dot(X[:, i])
(or, mathematically, Xi' * A * Xi
).
Currently, I have a for
loop:
out = np.empty((n,))
for i in xrange(n):
out[i] = X[:, i].T.dot(A).dot(X[:, i])
but since n
is large, I'd like to do this faster if possible (i.e. using some NumPy functions instead of a loop).
This seems to do it nicely:
(X.T.dot(A)*X.T).sum(axis=1)
Edit: This is a little faster. np.einsum('...i,...i->...', X.T.dot(A), X.T)
. Both work better if X
and A
are Fortran contiguous.
You can use the numpy.einsum
:
np.einsum('ji,jk,ki->i',x,a,x)
This will get the same result. Let's see if it is much faster:
Looks like dot
is still the fastest option, particularly because it uses threaded BLAS, as opposed to einsum
which runs on one core.
import perfplot
import numpy as np
def setup(n):
k = n
x = np.random.random((k, n))
A = np.random.random((k, k))
return x, A
def loop(data):
x, A = data
n = x.shape[1]
out = np.empty(n)
for i in range(n):
out[i] = x[:, i].T.dot(A).dot(x[:, i])
return out
def einsum(data):
x, A = data
return np.einsum('ji,jk,ki->i', x, A, x)
def dot(data):
x, A = data
return (x.T.dot(A)*x.T).sum(axis=1)
perfplot.show(
setup=setup,
kernels=[loop, einsum, dot],
n_range=[2**k for k in range(10)],
logx=True,
logy=True,
xlabel='n, k'
)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With