 

Efficient way to do math in elements of a numpy array according to condition

I am trying to optimize my Python code. One of the bottlenecks comes when I apply a function to a NumPy array according to each element's value. For instance, I have an array with a thousand elements, and I apply one function to the values greater than a tolerance and another function (a Taylor series) to the rest. I use masking, but it is still slow; I am calling the following functions at least 64 million times.

import numpy as np

EPSILONZETA = 1.0e-6
ZETA1_12 = 1.0/12.0
ZETA1_720 = 1.0/720.0

def masked_condition_zero(array, tolerance):
    """ Return the indices where values are smaller (and greater) than tolerance
    """
    # search indices where array values < tolerance
    indzeros_ = np.where(np.abs(array) < tolerance)[0]

    # create mask
    mask_ = np.ones(np.shape(array), dtype=bool)

    mask_[indzeros_] = False

    return (~mask_, mask_)

def bernoulli_function1(zeta):
    """ Returns the Bernoulli function of zeta, vector version
    """
    # get the indices according to condition
    zeros_, others_ = masked_condition_zero(zeta, EPSILONZETA)

    # create an array filled with zeros
    fb_ = np.zeros(np.shape(zeta))

    # Apply the original function to the values greater than EPSILONZETA
    fb_[others_] = zeta[others_]/(np.exp(zeta[others_])-1.0)  

    # computes series for zeta < eps
    zeta0_ = zeta[zeros_]
    zeta2_ = zeta0_ *  zeta0_
    zeta4_ =  zeta2_ * zeta2_
    fb_[zeros_] = 1.0 - 0.5*zeta0_ + ZETA1_12 * zeta2_ - ZETA1_720 * zeta4_
    return fb_
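As a quick sanity check (a sketch, not part of the original code), the fourth-order series really does agree with the exact expression at the crossover point, which is why switching branches at EPSILONZETA is safe:

```python
import numpy as np

def bernoulli_exact(z):
    # exact Bernoulli function z / (e^z - 1)
    return z / (np.exp(z) - 1.0)

def bernoulli_series(z):
    # Taylor expansion used for |z| < EPSILONZETA
    z2 = z * z
    return 1.0 - 0.5 * z + z2 / 12.0 - z2 * z2 / 720.0

# at the tolerance boundary the two branches agree to well below
# the floating-point noise of the exact expression
diff = abs(bernoulli_exact(1.0e-6) - bernoulli_series(1.0e-6))
```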

Now suppose you have an array zeta of negative and positive floats that changes on each pass of a loop that runs for 2^26 iterations, and you want to compute bernoulli_function1(zeta) each time.

Is there a better solution?

asked by Caos21 on Oct 31 '22

1 Answer

The basic structure of the problem is:

def foo(zeta):
    result = np.empty_like(zeta)
    I = condition(zeta)
    nI = ~I
    result[I] = func1(zeta[I])
    result[nI] = func2(zeta[nI])
    return result

It looks like your polynomial expression could be evaluated at all zeta, but it is the 'exception', the fallback calculation for when zeta is too close to 0.

If both functions can be evaluated for all zeta, you could use where:

np.where(condition(zeta), func1(zeta), func2(zeta))
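Applied to the question's Bernoulli function, a where-based version might look like the following sketch. Note that both branches are evaluated at every element, so the 0/0 at zeta near 0 is suppressed with np.errstate and the resulting nan values are discarded by the mask:

```python
import numpy as np

EPSILONZETA = 1.0e-6

def bernoulli_where(zeta):
    # series branch, finite everywhere
    z2 = zeta * zeta
    series = 1.0 - 0.5 * zeta + z2 / 12.0 - z2 * z2 / 720.0
    # exact branch is 0/0 at zeta == 0; np.where throws those values away
    with np.errstate(divide='ignore', invalid='ignore'):
        exact = zeta / (np.exp(zeta) - 1.0)
    return np.where(np.abs(zeta) < EPSILONZETA, series, exact)
```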

This is a streamlined version of:

def foo(zeta):
    result = np.empty_like(zeta)
    I = condition(zeta)
    nI = ~I
    v1 = func1(zeta)
    v2 = func2(zeta)
    result[I] = v1[I]
    result[nI] = v2[nI]
    return result

Another option is to apply one function to all values, and the other only to the 'exceptions'.

def foo(zeta):
    result = func2(zeta)
    I = condition(zeta)
    result[I] = func1(zeta[I])
    return result

and of course the reverse: result = func1(zeta); result[nI] = func2(zeta[nI]).
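For this problem the series can be evaluated everywhere, so the second pattern fits naturally. A sketch using the question's tolerance:

```python
import numpy as np

EPSILONZETA = 1.0e-6

def bernoulli_patch(zeta):
    z2 = zeta * zeta
    # cheap series evaluated everywhere ...
    result = 1.0 - 0.5 * zeta + z2 / 12.0 - z2 * z2 / 720.0
    # ... then overwritten with the exact expression where it is safe
    I = np.abs(zeta) >= EPSILONZETA
    result[I] = zeta[I] / (np.exp(zeta[I]) - 1.0)
    return result
```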

In my brief time tests, func1, func2 take about the same time.

masked_condition_zero also takes about that much time, but the simpler np.abs(array) < tolerance (and its complement ~J) cuts that in half.
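In other words, masked_condition_zero can be replaced by one comparison and its negation, returning the same pair of masks with fewer intermediate arrays (a sketch):

```python
import numpy as np

def masks(zeta, tolerance):
    # True where the series branch applies, False elsewhere
    J = np.abs(zeta) < tolerance
    return J, ~J
```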

Let's compare the allocation strategies:

def foo(zeta, J, nJ):
    result = np.empty_like(zeta)
    result[J] = fun1(zeta[J])
    result[nJ] = fun2(zeta[nJ])
    return result

For a sample where zeta[J] is 10% of the full zeta, some sample times are:

In [127]: timeit foo(zeta, J, nJ)
10000 loops, best of 3: 55.7 µs per loop

In [128]: timeit result=fun2(zeta); result[J]=fun1(zeta[J])
10000 loops, best of 3: 49.2 µs per loop

In [129]: timeit np.where(J, fun1(zeta),fun2(zeta))
10000 loops, best of 3: 73.4 µs per loop

In [130]: timeit result=fun1(zeta); result[nJ]=fun2(zeta[nJ])
10000 loops, best of 3: 60.7 µs per loop

The second case is fastest because running fun1 on fewer values compensates for the added cost of indexing zeta[J]. There's a trade-off between the cost of indexing and the cost of function evaluation. Boolean indexing like this is more expensive than slicing. With other mixes of values, the timings could go in the other direction.

It looks like a problem where you can whittle away at the times, but I don't see any breakthrough strategy that would cut times by an order of magnitude.

answered by hpaulj on Nov 09 '22