
Numpy accumulating one array in another using index array

My question is about a specific array operation that I want to express using numpy.

I have an array of floats w and an array of indices idx of the same length. I want to sum all entries of w that share the same idx value and collect the sums in an array v. As a loop, it looks like this:

for i, x in enumerate(w):
    v[idx[i]] += x

Is there a way to do this with array operations? My guess was v[idx] += w but that does not work, since idx contains the same index multiple times.
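A minimal sketch of the problem, using made-up sample values for w and idx (the names v_loop and v_fancy are only for illustration). With repeated indices, the fancy-indexed update computes v[idx] + w once and writes the result back, so earlier contributions to the same slot are overwritten rather than accumulated:

```python
import numpy as np

# Hypothetical sample data: indices 0 and 1 each appear twice.
w = np.array([1.0, 2.0, 3.0, 4.0])
idx = np.array([0, 1, 1, 0])

# Loop version: every contribution is accumulated.
v_loop = np.zeros(2)
for i, x in zip(idx, w):
    v_loop[i] += x

# Fancy-indexed version: for a repeated index only one write survives.
v_fancy = np.zeros(2)
v_fancy[idx] += w

print(v_loop)   # both slots hold 5.0 (1 + 4 and 2 + 3)
print(v_fancy)  # differs from v_loop: the sums are incomplete
```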

Thanks!

Andreas Mueller asked Mar 20 '12 16:03



2 Answers

numpy.bincount was introduced for this purpose:

import numpy as np

tmp = np.bincount(idx, weights=w)  # tmp[k] is the sum of w where idx == k
v[:len(tmp)] += tmp

I think as of NumPy 1.6 you can also pass minlength to bincount, which lets the result have the same length as v.
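A short sketch of the minlength variant, assuming NumPy 1.6 or newer, non-negative integer indices, and that every value in idx is smaller than len(v). The sample arrays are made up for illustration:

```python
import numpy as np

# Hypothetical sample data; v has more slots than idx ever touches.
w = np.array([0.5, 1.5, 2.0, 4.0])
idx = np.array([0, 2, 2, 0])
v = np.zeros(5)

# minlength pads the result with zeros so it lines up with v.
v += np.bincount(idx, weights=w, minlength=len(v))

print(v)  # slot 0 holds 4.5, slot 2 holds 3.5, the rest stay 0
```

If idx could contain a value of len(v) or larger, the bincount result would be longer than v and the addition would fail, so that case needs a separate check.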

Bi Rico answered Nov 06 '22 02:11



This is a known behavior and, though somewhat unfortunate, does not have a numpy-level workaround. (bincount can be used for this if you twist its arm.) Doing the loop yourself is really your best bet.

Note that your code might have been a bit clearer without re-using the name w and without introducing another set of indices, like this:

for i, w_thing in zip(idx, w):
    v[i] += w_thing

If you need to speed up this loop, you might have to drop down to C. Cython makes this relatively easy.
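Not part of the answer as written: NumPy releases from 1.8 onward provide np.add.at, an unbuffered in-place version of the addition that accumulates repeated indices the way the loop does. A minimal sketch, assuming such a NumPy version and made-up sample data:

```python
import numpy as np

# Hypothetical sample data: indices 0 and 1 each appear twice.
w = np.array([1.0, 2.0, 3.0, 4.0])
idx = np.array([0, 1, 1, 0])
v = np.zeros(3)

# Unbuffered in-place add: each occurrence of an index contributes.
np.add.at(v, idx, w)

print(v)  # slots 0 and 1 each hold 5.0, slot 2 stays 0
```

Whether this beats bincount or a Cython loop depends on the NumPy version and the array sizes, so it is worth timing on real data.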

Mike Graham answered Nov 06 '22 02:11
