
Vectorization of this Numpy double loop

How can I vectorize the following double-loop?

I have one N by A matrix and one N by B matrix, where A and B may differ and N is much smaller than A and B. I want to produce an A by B matrix as follows, but ideally without the loops:

import numpy as np

def foo(arr):
    # can be anything - just an example so that the code runs
    return np.sum(arr)

num_a = 12
num_b = 8
num_dimensions = 3

a = np.random.rand(num_dimensions, num_a)
b = np.random.rand(num_dimensions, num_b)

# this is the loop I want to eliminate:
output = np.zeros((num_a, num_b))
for i in range(num_a):  # range, not xrange, so this also runs on Python 3
    for j in range(num_b):
        output[i, j] = foo(a[:, i] - b[:, j])

Any ideas?

asked Nov 28 '11 17:11 by YXD



1 Answer

First vectorise foo(), i.e. modify foo() in a way that it can correctly operate on an array of shape (N, A, B), returning an array of shape (A, B). This step is usually the difficult one. How this is done entirely depends on what foo() does. For the given example, it's very easy to do:

def foo(arr):
    return np.sum(arr, axis=0)
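
As a quick sanity check of this step, the sketch below shows that the axis-aware foo() still reduces a single length-N difference vector to one number, and collapses a stacked (N, A, B) array to shape (A, B). The sizes N=3, A=4, B=5 and the variable names are made up for illustration only.

```python
import numpy as np

def foo(arr):
    # reduce over the first axis only (the N "dimension" axis)
    return np.sum(arr, axis=0)

# one difference vector of length N, as in the body of the original loop
single = np.array([1.0, 2.0, 3.0])
single_result = foo(single)        # a single number: 1 + 2 + 3

# a stack of difference vectors; hypothetical sizes N=3, A=4, B=5
stacked = np.ones((3, 4, 5))
stacked_result = foo(stacked)      # one sum per (A, B) position

print(single_result)
print(stacked_result.shape)
```

Because the reduction is tied to axis 0, the same function works unchanged in the original loop and on the broadcast array.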

Now, use broadcasting rules to create a (N, A, B) array containing all the vector differences, and pass it to foo():

# a[:, :, np.newaxis] has shape (N, A, 1); b[:, np.newaxis, :] has shape (N, 1, B)
output = foo(a[:, :, np.newaxis] - b[:, np.newaxis, :])
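
To see that the broadcast version reproduces the loop from the question, the following sketch runs both on the same random inputs and compares the results. The fixed seed and the names expected, diffs and results_match are additions for illustration, not part of the original post.

```python
import numpy as np

def foo(arr):
    # vectorised version: sum over the N axis only
    return np.sum(arr, axis=0)

num_a, num_b, num_dimensions = 12, 8, 3

np.random.seed(0)  # fixed seed so the check is repeatable (an assumption)
a = np.random.rand(num_dimensions, num_a)
b = np.random.rand(num_dimensions, num_b)

# reference result from the original double loop
expected = np.zeros((num_a, num_b))
for i in range(num_a):
    for j in range(num_b):
        expected[i, j] = foo(a[:, i] - b[:, j])

# broadcast (N, A, 1) against (N, 1, B) to get all differences, shape (N, A, B)
diffs = a[:, :, np.newaxis] - b[:, np.newaxis, :]
output = foo(diffs)

results_match = np.allclose(output, expected)
print(output.shape, results_match)
```

Note that the intermediate diffs array holds N * A * B elements, so for very large A and B the broadcast approach trades memory for speed.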
answered Oct 05 '22 18:10 by Sven Marnach