 

How to correctly convert numpy vectorize to numba vectorize


I'm trying to convert my program to Numba, but I'm running into a problem with one function nested inside another. My methods are based on NumPy's vectorize, but I can't do the same thing with Numba. Do you know of any analogous examples I could follow?

This is my program:



import numpy as np
import functools

class vectorize(np.vectorize):
    def __get__(self, obj, objtype):
        return functools.partial(self.__call__, obj)

class stability:
    def __init__(self):

        self.r1 = 20
        self.r2 = 50
        self.rz = 20

        self.zeta = self.r2/self.r1
        self.beta = self.rz/self.r1
        self.Ms = 0.956e6
        self.delta = 1

    @vectorize
    def f(self,ro,rs):
        # print("delta=",self.delta)
        return rs/ro*np.exp( (-1/self.delta)*(ro-rs))

    @vectorize
    def mz(self,ro,rs):
        return ( 1-self.f(ro,rs)**2 ) / ( 1+self.f(ro,rs)**2 )

    @vectorize
    def mro(self,ro,rs):
        return ( 2*self.f(ro,rs) ) / ( 1+self.f(ro,rs)**2 )

    @vectorize
    def E1m(self, a, b, N, rs, d):
        r = np.linspace(a + (b - a) / (2 * N), b - (b - a) / (2 * N), N)
        fx = r * ((1 / self.delta + 1 / r)**2 * self.mro(r, rs)**2
                  + (1 / r**2 + 1) * self.mro(r, rs)**2
                  + d * (-(1 / self.delta + 1 / r) * self.mro(r, rs)
                         + 1 / r * self.mro(r, rs) * self.mz(r, rs)))
        area = np.sum(fx)*(b-a)/N
        return area

if __name__ == "__main__":
    rs = np.arange(0,100,1)
    model = stability()
    print(model.E1m(0,20,300,rs,2))
asked Sep 13 '19 by kking


1 Answer

Most of the built-in NumPy functions are already vectorized and don't need the np.vectorize decorator at all. In general, the numpy.vectorize decorator will produce very slow results (compared to native NumPy operations)! As the documentation mentions in the Notes section:

The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop.

You can improve your code's efficiency immensely by removing the decorator from f, mz, and mro. It will give the same results but run much faster (your code: 10.4 seconds, changed code: 0.014 seconds).
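Concretely, that change is just dropping the decorator from those three methods (a sketch of the changed part only; __init__ and E1m stay exactly as in the question, so E1m still loops over the rs array):

class stability:
    # __init__ unchanged from the question

    # no @vectorize here any more: NumPy broadcasting handles arrays
    def f(self, ro, rs):
        return rs / ro * np.exp((-1 / self.delta) * (ro - rs))

    def mz(self, ro, rs):
        return (1 - self.f(ro, rs)**2) / (1 + self.f(ro, rs)**2)

    def mro(self, ro, rs):
        return (2 * self.f(ro, rs)) / (1 + self.f(ro, rs)**2)

    # E1m unchanged (it keeps its @vectorize decorator)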

The E1m function could also be improved (in terms of performance) by using broadcasting instead of vectorize.
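A minimal sketch of that broadcasting idea (my addition, not part of the timing comparison above; the name E1m_broadcast is mine). It would be added as a method on the same class, with f, mz and mro left as plain methods as shown before: give r a trailing axis so it broadcasts against the 1-D rs array, then sum over the r axis.

    def E1m_broadcast(self, a, b, N, rs, d):
        # r as an (N, 1) column so it broadcasts against the 1-D rs array
        r = np.linspace(a + (b - a) / (2 * N), b - (b - a) / (2 * N), N)[:, None]
        mro = self.mro(r, rs)
        mz = self.mz(r, rs)
        fx = r * ((1 / self.delta + 1 / r)**2 * mro**2
                  + (1 / r**2 + 1) * mro**2
                  + d * (-(1 / self.delta + 1 / r) * mro + 1 / r * mro * mz))
        # midpoint sum over the r axis -> one value per rs entry
        return fx.sum(axis=0) * (b - a) / N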


However, since your question was about how to use numba.vectorize on these functions, I have some bad news: it's not possible to use numba.vectorize on instance methods, because Numba needs type information and that isn't available for custom Python classes.

In general it's also best with Numba to start from plain loop code over NumPy arrays (no vectorize) and then apply the numba njit decorator (or jit(nopython=True)). That won't work on methods either, but it's much easier to pass in scalar arguments and only iterate over the required arrays (that's the approach shown at the end of this answer).


But if you really want to use the vectorize approach, here's how you would do it for f:

  • You cannot use an instance method because of the self argument, so you need a static method or a standalone function. Because you then don't have access to self, you either need to pass in delta or make it global. I decided to make it an argument:
def f(ro, rs, delta):
    return rs / ro * np.exp((-1 / delta) * (ro - rs))
  • Then you need to work out the types of your arguments (or which types you want to support) and of the return value, to build the signature. In the program above ro receives the float array from np.linspace, rs is an integer array and delta is an integer; since integers are safely cast to floats, a single all-float signature covers them (the syntax is return_type(argument_1_type, argument_2_type, ...)):
import numpy as np
import numba as nb

@nb.vectorize('f8(f8, f8, f8)')
def f(ro, rs, delta):
    return rs / ro * np.exp((-1 / delta) * (ro - rs))

And that's basically it.
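For example (the values here are hypothetical, just to show the call): the result of numba.vectorize is a regular NumPy ufunc, so it broadcasts over array arguments like any other ufunc:

r = np.linspace(0.1, 20, 300)[:, None]   # float column vector, avoids ro == 0
rs = np.arange(0, 100, 1)                # integer array, safely cast to float
print(f(r, rs, 1).shape)                 # ufunc broadcasting -> (300, 100)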

For mz and mro you can do the same (remember that you also need the delta there):

@nb.vectorize('f8(f8, f8, f8)')
def mz(ro, rs, delta):
    return (1 - f(ro, rs, delta)**2) / (1 + f(ro, rs, delta)**2)

@nb.vectorize('f8(f8, f8, f8)')
def mro(ro, rs, delta):
    return (2 * f(ro, rs, delta) ) / (1 + f(ro, rs, delta)**2)

Converting the E1m function seems to be a bit trickier (I haven't attempted it) and I leave it as an exercise for the reader.


In case you're interested in how I would solve it without vectorize:

import numpy as np
import numba as nb

@nb.njit
def f(ro, rs, delta):
    return rs / ro * np.exp((-1 / delta) * (ro - rs))

@nb.njit
def mz(ro, rs, delta):
    f_2 = f(ro, rs, delta) ** 2
    return (1 - f_2) / (1 + f_2)

@nb.njit
def mro(ro, rs, delta):
    f_ = f(ro, rs, delta)
    return (2 * f_ ) / (1 + f_**2)

@nb.njit(parallel=True)
def E1m(a, b, N, rs, d):
    delta = 1  # self.delta from the original class
    # midpoint rule: N sample points between a and b
    r = np.linspace(a + (b - a) / (2 * N), b - (b - a) / (2 * N), N)
    result = np.empty(rs.size)
    for idx in nb.prange(rs.size):  # parallel loop over the rs values
        rs_item = rs[idx]
        sum_ = 0.
        for r_item in r:
            mro_ = mro(r_item, rs_item, delta)
            sum_ += r_item * ((1 / delta + 1 / r_item)**2 * mro_**2  
                              + (1 / r_item**2 + 1) * mro_**2  
                              + d * (-(1 / delta + 1 / r_item) * mro_ 
                                     + 1 / r_item * mro_ * mz(r_item, rs_item, delta)))
        result[idx] = sum_ * (b - a) / N
    return result
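
It is called like the original program (the argument values are the ones from the question):

if __name__ == "__main__":
    rs = np.arange(0, 100, 1)
    print(E1m(0, 20, 300, rs, 2))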

There's probably still a bit that could be optimized by loop-lifting or smarter approaches with the calculations but on my computer it's already pretty fast: ~100 microseconds compared to the 14 milliseconds from above, so again 100 times faster.

answered Nov 25 '22 by MSeifert