In the code below, I have a simple for-loop that I'd like to replace with, hopefully, a faster vectorized numpy operation.
import numpy as np
b = np.array([9,8100,-60,7], dtype=np.float64)
a = np.array([584,-11,23,79,1001,0,-19], dtype=np.float64)
m = 3
n = b.shape[0]
l = n-m+1
k = a.shape[0]-m+1
QT = np.array([-85224., 181461., 580047., 8108811., 10149.])
QT_first = QT.copy()
out = [None] * l
for i in range(1, l):
QT[1:] = QT[:k-1] - b[i-1]*a[:k-1] + b[i-1+m]*a[-(k-1):]
QT[0] = QT_first[i]
# Update: Use this QT to do something with the ith element of array x
# As i updates in each iteration, QT changes
out[i] = np.argmin((QT + b_mean[i] * m) / (b_stddev[i] * m * a_stddev))
return out
I need a solution that is general enough to take on much longer arrays. Note that QT
depends on the m
and the length of b
and will always be provided.
How can I replace the for-loop with a numpy vectorized operation so that it is faster?
Update
I modified the original code to more clearly demonstrate why a convolution does not solve my problem. Convolve only gives me the final QT but I actually need to use the intermediate QT values for another calculation before updating it for the iteration of the for-loop.
I haven't studied this in enough detail to know exactly what it's doing - b[i-1]*a[:k-1] + b[i-1+m]*a[-(k-1):]
. But I suspect it can be written without an i
loop. Some sort of -b[slice1]*a[slice2] + b[slice3]*a[slice4]
.
The leaves the question of whether
for i in range(1, l):
QT[1:] = QT[:k-1] + c[i]
QT[0] = QT_first[i]
is inherently serial and iterative or not. You probably understand what is going on better than I do. I'd have to do a few manual loops to get a feel for what is happening.
Edit Please by advised that the solution below does not address what OP actually meant (only what they wrote, intially ;-)).
fftconvolve
seems to help quite a bit (opt is convolution solution):
k, n, m = 100, 100, 10
loopy 0.47971359 ms
opt 0.18304241 ms
k, n, m = 1000, 800, 20
loopy 6.44813690 ms
opt 0.33353859 ms
k, n, m = 10000, 10000, 1000
loopy 313.31501430 ms
opt 2.61437489 ms
Code:
import numpy as np
from scipy import signal
import types
from timeit import timeit
b = np.array([9,8100,-60,7], dtype=np.float64)
a = np.array([584,-11,23,79,1001,0,-19], dtype=np.float64)
m = 3
n = b.shape[0]
l = n-m+1
k = a.shape[0]-m+1
QT = np.array([-85224., 181461., 580047., 8108811., 10149.])
def mock_data(k, n, m):
QT = np.random.random((k,))
b = np.random.random((n,))
a = np.random.random((k+m-1,))
return QT, a, b, m
def f_loopy(QT, a, b, m):
k, l = a.size - m + 1, b.size - m + 1
QT, QT_first = QT.copy(), QT
for i in range(1, l):
QT[1:] = QT[:k-1] - b[i-1]*a[:k-1] + b[i-1+m]*a[-(k-1):]
QT[0] = QT_first[i]
return QT
def f_opt(QT, a, b, m):
k, l = a.size - m + 1, b.size - m + 1
ab1 = signal.fftconvolve(b[l-2::-1], a[:k-1], mode='full')[:QT.size-1]
ab2 = signal.fftconvolve(b[l-2+m:m-1:-1], a[1-k:], mode='full')[:QT.size-1]
return np.r_[0, ab2 - ab1] + np.r_[QT[l-1::-1], QT[1:QT.size+1-l]]
for k, n, m in [(100, 100, 10),
(1000, 800, 20),
(10000, 10000, 1000)]:
data = mock_data(k, n, m)
ref = f_loopy(*data)
print(f'k, n, m = {k}, {n}, {m}')
for name, func in list(globals().items()):
if not name.startswith('f_') or not isinstance(func, types.FunctionType):
continue
try:
assert np.allclose(ref, func(*data))
print("{:16s}{:16.8f} ms".format(name[2:], timeit(
'f(*data)', globals={'f':func, 'data':data}, number=10)*100))
except:
print("{:16s} apparently failed".format(name[2:]))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With