I have several (on the order of 1000) 3D arrays of shape (1000, 800, 1024) that I want to study. I need to calculate the mean along axis=0, but before I can do that, I have to roll the data along axis 2 until it is 'in the right place'.
This sounds strange, so I'll try to explain. Each 1D sub-array of shape (1024,) is data from a physical ring buffer. The ring buffer is read out at different positions, which I know. So I have several arrays pos of shape (1000, 800) telling me at which position the ring buffer was read out, and my 3D arrays data of shape (1000, 800, 1024) that I need to roll according to pos.
Only after the rolling are the 3D arrays meaningful to me, and I can start analysing them. In C one can write code that does this pretty simply, so I wonder if I can somehow 'tell' the numpy mean() or sum() routines to start at a different index and 'roll around' at the end of the 1D sub-array.
What I currently do is this:
rolled = np.zeros_like(data)  # shape (1000, 800, 1024)
for a in range(rolled.shape[0]):
    for b in range(rolled.shape[1]):
        rolled[a, b] = np.roll(data[a, b], pos[a, b])
This takes ~60 s. And then I do e.g.:
m = rolled.mean(axis=0)
s = rolled.std(axis=0)
which only takes 15 s or so.
My point is that making the rolled copy takes a lot of space and time (okay, I could save the space by writing the rolled data back into data), while there is definitely a way (in C) to implement the averaging and the rolling in one loop, thereby saving a lot of time.
My question is: is there a way to do something similar with numpy?
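For reference, the rolling step itself can be vectorized in plain NumPy by building the gather index once and using np.take_along_axis (available since NumPy 1.15); this is a sketch with small stand-in shapes, not code from the original post:

```python
import numpy as np

rng = np.random.default_rng(0)
s1, s2, s3 = 4, 3, 8          # small stand-ins for (1000, 800, 1024)
data = rng.standard_normal((s1, s2, s3))
pos = rng.integers(0, s3, size=(s1, s2))

# np.roll(x, p) satisfies rolled[c] == x[(c - p) % n], so gather with that index.
idx = (np.arange(s3) - pos[..., None]) % s3
rolled = np.take_along_axis(data, idx, axis=2)

# Same result as the explicit double loop over np.roll.
check = np.empty_like(data)
for a in range(s1):
    for b in range(s2):
        check[a, b] = np.roll(data[a, b], pos[a, b])
assert np.allclose(rolled, check)
```

This still allocates a full rolled copy (the `idx` array too), so it mainly removes the Python-loop overhead rather than the memory cost.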
I got bored and wrote your function in Cython. It runs about 10x faster than the code you posted, without allocating an intermediate array.
import numpy as np

cimport numpy as np
cimport cython
from libc.math cimport sqrt

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.nonecheck(False)
@cython.cdivision(True)
def rolled_mean_std(double[:,:,::1] data, int[:,::1] pos):
    cdef Py_ssize_t s1, s2, s3, a, b, c
    s1 = data.shape[0]
    s2 = data.shape[1]
    s3 = data.shape[2]
    cdef double[:,::1] sums = np.zeros((s2, s3))
    cdef double[:,::1] sumsq = np.zeros((s2, s3))
    cdef double d
    cdef int p

    # Accumulate sums and sums of squares, reading each element
    # at its rolled position instead of materializing a rolled copy.
    for a in range(s1):
        for b in range(s2):
            p = pos[a, b]
            for c in range(s3):
                d = data[a, b, (c + s3 - p) % s3]
                sums[b, c] += d
                sumsq[b, c] += d * d

    # Convert the accumulators to mean and std in place.
    for b in range(s2):
        for c in range(s3):
            d = sums[b, c]
            sums[b, c] /= s1
            sumsq[b, c] = sqrt(s1 * sumsq[b, c] - d * d) / s1
    return sums, sumsq
Note that this uses the naive one-pass mean/std algorithm, so you might run into floating-point accuracy errors. My tests didn't show much of an effect, though.
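The same one-pass accumulation can also be sketched in plain NumPy by looping only over axis 0, so the largest temporaries are (800, 1024) slices instead of a full rolled copy. This is an illustrative sketch (assumes np.take_along_axis, NumPy ≥ 1.15), not code from the answer:

```python
import numpy as np

rng = np.random.default_rng(1)
s1, s2, s3 = 5, 3, 8          # small stand-ins for (1000, 800, 1024)
data = rng.standard_normal((s1, s2, s3))
pos = rng.integers(0, s3, size=(s1, s2))

sums = np.zeros((s2, s3))
sumsq = np.zeros((s2, s3))
base = np.arange(s3)
for a in range(s1):
    # Roll each (s3,) row of this slice via a gathered index, then accumulate.
    idx = (base - pos[a][:, None]) % s3
    r = np.take_along_axis(data[a], idx, axis=1)
    sums += r
    sumsq += r * r

m = sums / s1
s = np.sqrt(sumsq / s1 - m * m)   # naive population std, like the Cython version
```

It trades the Cython speedup for zero build step; the accumulation over 1000 slices stays in compiled NumPy code.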