Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using Numpy Vectorize on Functions that Return Vectors

numpy.vectorize takes a function f:a->b and turns it into g:a[]->b[].

This works fine when a and b are scalars, but I can't think of a reason why it wouldn't work with b as an ndarray or list, i.e. f:a->b[] and g:a[]->b[][]

For example:

import numpy as np def f(x):     return x * np.array([1,1,1,1,1], dtype=np.float32) g = np.vectorize(f, otypes=[np.ndarray]) a = np.arange(4) print(g(a)) 

This yields:

array([[ 0.  0.  0.  0.  0.],        [ 1.  1.  1.  1.  1.],        [ 2.  2.  2.  2.  2.],        [ 3.  3.  3.  3.  3.]], dtype=object) 

Ok, so that gives the right values, but the wrong dtype. And even worse:

g(a).shape 

yields:

(4,) 

So this array is pretty much useless. I know I can convert it doing:

np.array(map(list, a), dtype=np.float32) 

to give me what I want:

array([[ 0.,  0.,  0.,  0.,  0.],        [ 1.,  1.,  1.,  1.,  1.],        [ 2.,  2.,  2.,  2.,  2.],        [ 3.,  3.,  3.,  3.,  3.]], dtype=float32) 

but that is neither efficient nor pythonic. Can any of you guys find a cleaner way to do this?

Thanks in advance!

like image 489
prodigenius Avatar asked Jul 31 '10 18:07

prodigenius


People also ask

What does NumPy vectorize do?

The vectorized function evaluates pyfunc over successive tuples of the input arrays like the python map function, except it uses the broadcasting rules of numpy. The data type of the output of vectorized is determined by calling the function with the first element of the input.

Is NumPy vectorize faster than for loop?

Again, some have observed vectorize to be faster than normal for loops, but even the NumPy documentation states: “The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop.”

Why should we try to use vectorized operations with NumPy arrays?

The concept of vectorized operations on NumPy allows the use of more optimal and pre-compiled functions and mathematical operations on NumPy array objects and data sequences. The Output and Operations will speed up when compared to simple non-vectorized operations.


1 Answers

np.vectorize is just a convenience function. It doesn't actually make code run any faster. If it isn't convenient to use np.vectorize, simply write your own function that works as you wish.

The purpose of np.vectorize is to transform functions which are not numpy-aware (e.g. take floats as input and return floats as output) into functions that can operate on (and return) numpy arrays.

Your function f is already numpy-aware -- it uses a numpy array in its definition and returns a numpy array. So np.vectorize is not a good fit for your use case.

The solution therefore is just to roll your own function f that works the way you desire.

like image 118
unutbu Avatar answered Oct 15 '22 15:10

unutbu