Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Vectorize iterative addition in NumPy arrays

For each element in a randomized array of 2D indices (with potential duplicates), I want to "+=1" to the corresponding grid in a 2D zero array. However, I don't know how to optimize the computation. Using the standard for loop, as shown here,

def interadd():
    U = 100 
    input = np.random.random(size=(5000,2)) * U
    idx = np.floor(input).astype(np.int) 

    grids = np.zeros((U,U))      
    for i in range(len(input)):
        grids[idx[i,0],idx[i,1]] += 1
    return grids

The runtime can be quite significant:

>> timeit(interadd, number=5000)
43.69953393936157

Is there a way to vectorize this iterative process?

like image 682
neither-nor Avatar asked Jun 27 '15 22:06

neither-nor


People also ask

What does vectorize do in NumPy?

The vectorized function evaluates pyfunc over successive tuples of the input arrays like the python map function, except it uses the broadcasting rules of numpy. The data type of the output of vectorized is determined by calling the function with the first element of the input.

Does NumPy vectorize fast?

Again, some have observed vectorize to be faster than normal for loops, but even the NumPy documentation states: “The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop.”

How do I create a vector array in NumPy?

In order to create a vector, we use np. array method. Note: We can create vector with other method as well which return 1-D numpy array for example np. arange(10), np.

Is NumPy a vectorization?

Vectorization is a term used outside of numpy, and in very basic terms is parallelisation of calculations. In normal python this would be done element by element using something like a for loop, so four calculations one after the other.


1 Answers

You could speed it up a little by using np.add.at, which correctly handles the case of duplicate indices:

def interadd(U, idx):
    grids = np.zeros((U,U))      
    for i in range(len(idx)):
        grids[idx[i,0],idx[i,1]] += 1
    return grids

def interadd2(U, idx):
    grids = np.zeros((U,U))
    np.add.at(grids, idx.T.tolist(), 1)
    return grids

def interadd3(U, idx):
    # YXD suggestion
    grids = np.zeros((U,U))
    np.add.at(grids, (idx[:,0], idx[:,1]), 1)
    return grids

which gives

>>> U = 100
>>> idx = np.floor(np.random.random(size=(5000,2))*U).astype(np.int)
>>> (interadd(U, idx) == interadd2(U, idx)).all()
True
>>> %timeit interadd(U, idx)
100 loops, best of 3: 8.48 ms per loop
>>> %timeit interadd2(U, idx)
100 loops, best of 3: 2.62 ms per loop

And YXD's suggestion:

>>> (interadd(U, idx) == interadd3(U, idx)).all()
True
>>> %timeit interadd3(U, idx)
1000 loops, best of 3: 1.09 ms per loop
like image 141
DSM Avatar answered Oct 13 '22 00:10

DSM