
Finding median value of multiple NumPy arrays

Tags: python, numpy

I have a for loop that creates about 50 arrays, each of length 240. I'm trying to figure out the best way to calculate the median at each element position across the arrays. Essentially, I want to take the first element of each array created in the loop, put them into a list, and find the median, then do the same for the other 239 elements. Something like this is what I have in mind:

a = np.array([1, 2, 4, 56, 67, 8, 8, 9])
b = np.array([-1, -3, 5, 6, -7, -6, -8, 0])
c = np.array([1, 2, 3, 4, 5, 6, 7, 8])

d = []
d.append(a[0])
d.append(b[0])
d.append(c[0])

d
Out[62]: [1, -1, 1]

np.median(d)
Out[65]: 1.0
may7even asked Dec 18 '22 12:12



2 Answers

np.median can take the median along whichever axis you want. So if you can get all your individual arrays into a single 2-D array, you can call np.median() once and get all the medians at the same time:

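a = np.array([1, 2, 4, 56, 67, 8, 8, 9])
b = np.array([-1, -3, 5, 6, -7, -6, -8, 0])
c = np.array([1, 2, 3, 4, 5, 6, 7, 8])

# Stack the 1-D arrays into one 2-D array with shape (3, 8)
d = np.stack([a, b, c])
# axis=0 takes the median down each column, i.e. across the arrays
np.median(d, axis=0)

# array([1., 2., 4., 6., 5., 6., 7., 8.])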

Of course, if you can build the 50x240 array directly without the loop, that's even better.
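If the loop can't be avoided, one option is to preallocate the 2-D array and fill one row per iteration. This is only a sketch: `make_array` is a hypothetical stand-in for whatever the loop in the question actually computes, and the random data is just filler.

```python
import numpy as np

def make_array(i, n=240):
    # Hypothetical placeholder for the real per-iteration computation;
    # here it just returns n reproducible random values.
    rng = np.random.default_rng(i)
    return rng.random(n)

n_arrays, n = 50, 240

# One row per array produced by the loop, one column per element position
data = np.empty((n_arrays, n))
for i in range(n_arrays):
    data[i] = make_array(i, n)

# Median across the 50 arrays at each of the 240 positions
medians = np.median(data, axis=0)
print(medians.shape)
```

Collecting the arrays in a list and calling np.stack at the end should give the same result; preallocating just avoids the extra copy.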

The timing difference between letting NumPy do this and using a Python loop is compelling:

l = [np.random.rand(240) for _ in range(50)]

def one(l):
    # Python-level loop: one np.median call per element position
    return np.array(list(map(np.median, zip(*l))))

def two(l):
    # Single vectorized call on the stacked (50, 240) array
    d = np.stack(l)
    return np.median(d, axis=0)

> %timeit one(l)
  17 ms ± 1.17 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)

> %timeit two(l)
  456 µs ± 39.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
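To reproduce a comparison like this outside IPython, something along these lines should work with the standard-library timeit module. It is a rough sketch: the function names are my own, and the absolute numbers will vary with the machine and NumPy version.

```python
import timeit
import numpy as np

arrays = [np.random.rand(240) for _ in range(50)]

def loop_median(arrays):
    # One np.median call per element position (240 calls in total)
    return np.array([np.median(column) for column in zip(*arrays)])

def stacked_median(arrays):
    # One np.median call on the stacked (50, 240) array
    return np.median(np.stack(arrays), axis=0)

# Both approaches should agree before their speed is compared
assert np.allclose(loop_median(arrays), stacked_median(arrays))

runs = 20
t_loop = timeit.timeit(lambda: loop_median(arrays), number=runs)
t_stack = timeit.timeit(lambda: stacked_median(arrays), number=runs)
print(f"loop:    {t_loop / runs * 1e3:.3f} ms per call")
print(f"stacked: {t_stack / runs * 1e3:.3f} ms per call")
```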
Mark answered Dec 30 '22 09:12



You can do it this way:

medians = [np.median([a[i],b[i],c[i]]) for i in range(len(a))]
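With around 50 arrays, naming each one in the comprehension gets unwieldy. A possible generalization, assuming the arrays are collected in a list (called `arrays` here, a name I made up) and all have the same length:

```python
import numpy as np

a = np.array([1, 2, 4, 56, 67, 8, 8, 9])
b = np.array([-1, -3, 5, 6, -7, -6, -8, 0])
c = np.array([1, 2, 3, 4, 5, 6, 7, 8])

# Assumed container: every array produced by the loop goes in this list
arrays = [a, b, c]

# For each position i, gather the i-th element of every array and take the median
medians = [np.median([arr[i] for arr in arrays]) for i in range(len(arrays[0]))]
print(medians)
```

This still calls np.median once per position in a Python-level loop, so it is likely to be slower than a single call with axis=0 on a stacked array.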
daveturner answered Dec 30 '22 10:12
