Finding median value of multiple NumPy arrays

Question

I have a for loop that creates about 50 arrays, each of length 240. I'm trying to figure out the best way to calculate the element-wise median across them. Essentially, I want to take the first element of each array created in the loop, put them into a list, and find the median, then do the same for the other 239 elements. Something like this is what I'm thinking of:

import numpy as np

a = np.array([1, 2, 4, 56, 67, 8, 8, 9])
b = np.array([-1, -3, 5, 6, -7, -6, -8, 0])
c = np.array([1, 2, 3, 4, 5, 6, 7, 8])

# Collect the first element of each array, then take the median
d = []
d.append(a[0])
d.append(b[0])
d.append(c[0])

d
Out[62]: [1, -1, 1]

np.median(d)
Out[65]: 1.0

Mark · Accepted Answer

np.median can take the median along whichever axis you want. So if you can get all your individual arrays into a single 2-D array, you can call np.median() once with axis=0 and get all the medians at once:

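import numpy as np

a = np.array([1, 2, 4, 56, 67, 8, 8, 9])
b = np.array([-1, -3, 5, 6, -7, -6, -8, 0])
c = np.array([1, 2, 3, 4, 5, 6, 7, 8])

d = np.stack([a, b, c])   # shape (3, 8): one row per input array
np.median(d, axis=0)      # median down each column

# array([1., 2., 4., 6., 5., 6., 7., 8.])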
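To make the axis choice concrete, here is a small sketch using the same three arrays. The names per_position and per_array are only illustrative; the point is that axis=0 collapses the rows (giving one median per position, which is what the question asks for), while axis=1 would give one median per input array instead.

```python
import numpy as np

a = np.array([1, 2, 4, 56, 67, 8, 8, 9])
b = np.array([-1, -3, 5, 6, -7, -6, -8, 0])
c = np.array([1, 2, 3, 4, 5, 6, 7, 8])

d = np.stack([a, b, c])               # shape (3, 8): rows are arrays, columns are positions

per_position = np.median(d, axis=0)   # collapses the 3 rows -> 8 values
per_array = np.median(d, axis=1)      # collapses the 8 columns -> 3 values (8.0, -2.0, 4.5)

print(d.shape, per_position.shape, per_array.shape)
```

Note that np.stack requires all the arrays to have the same length, which holds here since every array has 240 elements.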

Of course, if you can build the 50x240 array directly without the loop, that's even better.
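What "directly" looks like depends on how the loop produces each array, which the question doesn't show. As a rough sketch, assuming a hypothetical function make_row that stands in for the loop body, one option is to preallocate the 2-D array and fill one row per iteration, so no separate stacking step is needed:

```python
import numpy as np

N_ARRAYS, LENGTH = 50, 240            # sizes taken from the question
rng = np.random.default_rng(0)        # random data is only a placeholder


def make_row(i):
    """Hypothetical stand-in for whatever the real loop body computes."""
    return rng.random(LENGTH)


# Preallocate once, then fill one row per iteration.
data = np.empty((N_ARRAYS, LENGTH))
for i in range(N_ARRAYS):
    data[i] = make_row(i)

medians = np.median(data, axis=0)     # one median per position, shape (240,)
```

If the loop body itself can be vectorized (in this placeholder case, rng.random((N_ARRAYS, LENGTH)) would do it), the loop disappears entirely.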

The timing of letting NumPy do this versus a Python loop is compelling:

l = [np.random.rand(240) for _ in range(50)]

def one(l):
    return np.array(list(map(np.median, zip(*l))))

def two(l):
    d = np.stack(l)
    return np.median(d, axis = 0)

> %timeit one(l)
  17 ms ± 1.17 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)

> %timeit two(l)
  456 µs ± 39.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
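%timeit only works in IPython or Jupyter. If you want to reproduce the comparison in a plain script, something like the following should work with the standard timeit module. It also checks that both functions return the same medians. The variable names and the call count of 20 are arbitrary choices, and the absolute timings will differ from machine to machine.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
arrays = [rng.random(240) for _ in range(50)]


def one(l):
    # Python-level loop: one np.median call per position
    return np.array(list(map(np.median, zip(*l))))


def two(l):
    # Single vectorized call on the stacked (50, 240) array
    return np.median(np.stack(l), axis=0)


# Both approaches should agree before their speed is compared.
results_match = np.allclose(one(arrays), two(arrays))

t_one = timeit.timeit(lambda: one(arrays), number=20)
t_two = timeit.timeit(lambda: two(arrays), number=20)
print(f"match: {results_match}, loop: {t_one:.4f}s, stacked: {t_two:.4f}s (20 calls each)")
```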

daveturner · Answer

You can do it this way:

medians = [np.median([a[i],b[i],c[i]]) for i in range(len(a))]

Tags:

python

numpy

may7even
