I have a numpy array with two columns:
A = [[1,1,1,2,3,1,2,3],[0.1,0.2,0.2,0.1,0.3,0.2,0.2,0.1]]
For each unique value in the first column, I want the average of the corresponding values in the second. For example:
B = [[1,2,3], [0.175, 0.15, 0.2]]
Is there a pythonic way to do this?
I think the following is the standard numpy approach for this kind of computation. The call to np.unique can be skipped if the entries of A[0] are small non-negative integers, but it makes the whole operation more robust and independent of the actual data.
>>> import numpy as np
>>> A = [[1,1,1,2,3,1,2,3],[0.1,0.2,0.2,0.1,0.3,0.2,0.2,0.1]]
>>> unq, unq_idx = np.unique(A[0], return_inverse=True)
>>> unq_sum = np.bincount(unq_idx, weights=A[1])
>>> unq_counts = np.bincount(unq_idx)
>>> unq_avg = unq_sum / unq_counts
>>> unq
array([1, 2, 3])
>>> unq_avg
array([ 0.175, 0.15 , 0.2 ])
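For comparison, here is a rough sketch of the shortcut without np.unique. It assumes the keys really are small non-negative integers, since np.bincount allocates one slot per integer from 0 up to the largest key. The names keys, values, present and keys_present are only for illustration.

```python
import numpy as np

A = [[1, 1, 1, 2, 3, 1, 2, 3],
     [0.1, 0.2, 0.2, 0.1, 0.3, 0.2, 0.2, 0.1]]

keys = np.asarray(A[0])    # assumption: small non-negative integers
values = np.asarray(A[1])

counts = np.bincount(keys)                 # counts[k] = how often key k occurs
sums = np.bincount(keys, weights=values)   # sums[k] = total of the values with key k

present = counts > 0                       # skip integers that never occur (here, 0)
keys_present = np.flatnonzero(present)     # the keys that actually appear
averages = sums[present] / counts[present]

print(keys_present)
print(averages)
```

If the keys can be negative, non-integer, or very large, the np.unique version is the safer choice.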
You could of course then stack both arrays, although that will convert unq to float dtype:
>>> np.vstack((unq, unq_avg))
array([[ 1. , 2. , 3. ],
[ 0.175, 0.15 , 0.2 ]])
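If the float conversion is a problem, one possible alternative is a structured array, which lets each field keep its own dtype. This is only a sketch: group_mean is a hypothetical helper wrapping the steps above, and the field names "key" and "avg" are made up for the example.

```python
import numpy as np

def group_mean(keys, values):
    """Hypothetical helper: mean of `values` for each distinct entry of `keys`."""
    unq, unq_idx = np.unique(keys, return_inverse=True)
    sums = np.bincount(unq_idx, weights=values)
    counts = np.bincount(unq_idx)
    return unq, sums / counts

A = [[1, 1, 1, 2, 3, 1, 2, 3],
     [0.1, 0.2, 0.2, 0.1, 0.3, 0.2, 0.2, 0.1]]

unq, unq_avg = group_mean(A[0], A[1])

# One record per unique key; "key" keeps the integer dtype, "avg" is float.
B = np.empty(unq.size, dtype=[("key", unq.dtype), ("avg", np.float64)])
B["key"] = unq
B["avg"] = unq_avg

print(B["key"])
print(B["avg"])
```

Simply keeping unq and unq_avg as two separate arrays works just as well when you do not need a single object.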