
Making for loop with index arrays faster

I have the following problem: I have index arrays with repeating indices and would like to add values to an array like this:

grid_array[xidx[:],yidx[:],zidx[:]] += data[:]

However, because the indices repeat, this does not work as intended: NumPy creates a temporary array, so the data for repeated indices is assigned several times instead of being accumulated (see http://docs.scipy.org/doc/numpy/user/basics.indexing.html).
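A minimal sketch of the effect, using a hypothetical 1-D array and index list rather than the real grid. The fancy-indexed `+=` touches a repeated index only once, while an explicit loop accumulates every occurrence:

```python
import numpy as np

idx = np.array([0, 0, 2])      # index 0 appears twice (made-up example data)

buffered = np.zeros(3)
buffered[idx] += 1.0           # behaves like buffered[idx] = buffered[idx] + 1.0

looped = np.zeros(3)
for i in idx:                  # explicit loop adds once per occurrence
    looped[i] += 1.0

print(buffered)                # index 0 was incremented only once
print(looped)                  # index 0 was incremented twice
```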

A for loop like

for i in range(0,n):
    grid_array[xidx[i],yidx[i],zidx[i]] += data[i]

will be way too slow. Is there a way I can still use NumPy's vectorization? Or is there another way to make this assignment faster?

Thanks for your help

numberCruncher asked Nov 21 '25 20:11



2 Answers

How about using bincount?

import numpy as np

flat_index = np.ravel_multi_index([xidx, yidx, zidx], grid_array.shape)
datasum = np.bincount(flat_index, data, minlength=grid_array.size)
grid_array += datasum.reshape(grid_array.shape)
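To check this against the plain loop, here is a small self-contained sketch. The grid shape, sample count, and random data are made up for illustration and not taken from the question:

```python
import numpy as np

rng = np.random.default_rng(0)
shape = (4, 5, 6)                          # hypothetical grid shape
n = 1000                                   # hypothetical number of samples
xidx = rng.integers(0, shape[0], n)
yidx = rng.integers(0, shape[1], n)
zidx = rng.integers(0, shape[2], n)
data = rng.random(n)

# Reference result from the slow explicit loop.
expected = np.zeros(shape)
for i in range(n):
    expected[xidx[i], yidx[i], zidx[i]] += data[i]

# bincount approach: flatten the 3-D indices, sum the weights per flat bin.
grid_array = np.zeros(shape)
flat_index = np.ravel_multi_index((xidx, yidx, zidx), grid_array.shape)
datasum = np.bincount(flat_index, weights=data, minlength=grid_array.size)
grid_array += datasum.reshape(grid_array.shape)

print(np.allclose(grid_array, expected))
```

One thing to watch: with weights, bincount returns a float64 array, so the in-place += assumes grid_array is a floating-point array. With an integer grid_array the in-place add would raise a casting error.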
Bi Rico answered Nov 24 '25 23:11



This is a buffering issue. The ufunc method .at performs the operation unbuffered and in place, so every occurrence of a repeated index is accumulated: http://docs.scipy.org/doc/numpy/reference/generated/numpy.ufunc.at.html#numpy.ufunc.at

np.add.at(grid_array, (xidx,yidx,zidx),data)
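A tiny sketch with made-up index arrays, where the first two entries point at the same cell, so their values should be summed. It uses an integer grid to show that .at works in place without changing the dtype:

```python
import numpy as np

grid_array = np.zeros((2, 2, 2), dtype=np.int64)   # hypothetical small grid
xidx = np.array([0, 0, 1])
yidx = np.array([1, 1, 0])
zidx = np.array([0, 0, 1])
data = np.array([10, 5, 7])

# Unbuffered in-place add: both entries for cell (0, 1, 0) are accumulated.
np.add.at(grid_array, (xidx, yidx, zidx), data)

print(grid_array[0, 1, 0])   # sum of the two values aimed at this cell
print(grid_array[1, 0, 1])
```

In older NumPy versions ufunc.at was known to be considerably slower than bincount-style approaches, so it is worth timing both on your own data.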
hpaulj answered Nov 24 '25 22:11
