I have a for loop that creates about 50 arrays, each of length 240. I'm trying to figure out the best way to calculate the median at each element position across the arrays. Essentially, I want to take the first element of each array created in the loop, put them into a list, and find the median, then do the same for the other 239 elements. Something like this is what I'm thinking of:
a = np.array([1,2,4,56,67,8,8,9])
b = np.array([-1,-3,5,6,-7,-6,-8,0])
c = np.array([1,2,3,4,5,6,7,8])
d = []
d.append(a[0])
d.append(b[0])
d.append(c[0])
d
Out[62]: [1, -1, 1]
np.median(d)
Out[65]: 1.0
np.median will take the median along whichever axis you want. So if you can get all your individual arrays into a single 2-D array, you can call np.median() with axis=0 and get all the per-position medians at once:
a = np.array([1,2,4,56,67,8,8,9])
b = np.array([-1,-3,5,6,-7,-6,-8,0])
c = np.array([1,2,3,4,5,6,7,8])
d = np.stack([a, b, c])
np.median(d, axis=0)
# array([1., 2., 4., 6., 5., 6., 7., 8.])
Of course, if you can make the 50x240 array directly without the loop, that's even better.
The timing difference between letting NumPy do this and iterating at the Python level is compelling:
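If the loop itself cannot be removed, one middle ground is to preallocate the 2-D array and write each result into its own row, so no separate stacking step is needed. This is only a sketch: `make_array` is a hypothetical stand-in for whatever the loop body actually computes, and the sizes 50 and 240 are taken from the question.

```python
import numpy as np

def make_array(i, length=240):
    # Hypothetical placeholder for the real per-iteration computation.
    rng = np.random.default_rng(i)
    return rng.random(length)

n_arrays, length = 50, 240

# Preallocate one row per array produced by the loop.
stacked = np.empty((n_arrays, length))
for i in range(n_arrays):
    stacked[i] = make_array(i, length)

# One median per column, i.e. per element position across the 50 arrays.
medians = np.median(stacked, axis=0)
print(medians.shape)
```

The result has one entry per element position, so its shape is (240,).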
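import numpy as np

l = [np.random.rand(240) for _ in range(50)]

def one(l):
    # Python-level iteration: one np.median call per element position
    return np.array(list(map(np.median, zip(*l))))

def two(l):
    # Stack into a (50, 240) array and take the median along axis 0
    d = np.stack(l)
    return np.median(d, axis=0)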
> %timeit one(l)
17 ms ± 1.17 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
> %timeit two(l)
456 µs ± 39.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
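Before relying on the faster version, it may be worth confirming that the two approaches agree. The following is a minimal sketch that redefines the two functions from the timing comparison so it runs on its own; the seeded generator is an assumption made only to keep the check reproducible.

```python
import numpy as np

# Seeded generator (an assumption, for reproducibility only).
rng = np.random.default_rng(0)
l = [rng.random(240) for _ in range(50)]

def one(l):
    # One np.median call per element position, driven from Python.
    return np.array(list(map(np.median, zip(*l))))

def two(l):
    # Single vectorized call over the stacked (50, 240) array.
    return np.median(np.stack(l), axis=0)

# Both approaches should produce the same 240 medians.
assert np.allclose(one(l), two(l))
```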
You can also do it this way, with a list comprehension over the index (shown here for the three arrays a, b and c):
medians = [np.median([a[i],b[i],c[i]]) for i in range(len(a))]