
How to write a CASE WHEN-like statement for a NumPy array

Tags: python, numpy, numpy-ndarray

import numpy as np

def custom_asymmetric_train(y_true, y_pred):
    # Positive residuals (under-predictions) are penalised 10x more heavily.
    residual = (y_true - y_pred).astype("float")
    grad = np.where(residual>0, -2*10.0*residual, -2*residual)
    hess = np.where(residual>0, 2*10.0, 2.0)
    return grad, hess

I want to write this statement:

    case when residual>=0 and residual<=0.5 then -2*1.2*residual
         when residual>=0.5 and residual<=0.7 then -2*1.*residual
         when residual>0.7 then -2*2*residual
    end

However, I can't work out how to express the and (&) logic in np.where. How do I write this CASE WHEN logic with np.where in Python?

Thanks

676

asked Jan 10 '20 18:01

Mike

1 Answer

This statement can be written using np.select as:

import numpy as np

residual = np.random.rand(10) -0.3 # -0.3 to get some negative values
condlist = [(residual>=0.0)&(residual<=0.5), (residual>=0.5)&(residual<=0.7), residual>0.7]
choicelist = [-2*1.2*residual, -2*1.0*residual,-2*2.0*residual]

residual = np.select(condlist, choicelist, default=residual)

Note that when several conditions in condlist are satisfied, the first one encountered is used. When all conditions evaluate to False, the default value is used. Also, you need to use the bitwise operator & on boolean NumPy arrays, because the Python keyword and does not work on them.
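The first-match and default behaviour described above can be checked on a tiny array. This is only a minimal sketch: the array r and its values are made up for illustration, chosen so that every branch (including the shared 0.5 boundary) is hit.

```python
import numpy as np

# Hypothetical sample values: one per branch, plus the 0.5 boundary.
r = np.array([-0.2, 0.3, 0.5, 0.6, 0.9])

condlist = [(r >= 0.0) & (r <= 0.5), (r >= 0.5) & (r <= 0.7), r > 0.7]
choicelist = [-2 * 1.2 * r, -2 * 1.0 * r, -2 * 2.0 * r]
out = np.select(condlist, choicelist, default=r)

# r == 0.5 satisfies the first two conditions; np.select takes the first one.
# r == -0.2 satisfies none of them, so it falls through to the default.
print(out)

# The `and` keyword tries to reduce each array to a single bool and raises.
try:
    (r >= 0.0) and (r <= 0.5)
    and_failed = False
except ValueError:
    and_failed = True
print(and_failed)
```

The parentheses around each comparison matter, because & binds more tightly than >= and <=.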

Let's benchmark these answers:

residual = np.random.rand(10000) -0.3

def charl_3where(residual):
    residual = np.where((residual>=0.0)&(residual<=0.5), -2*1.2*residual, residual)
    residual = np.where((residual>=0.5)&(residual<=0.7), -2*1.0*residual, residual)
    residual = np.where(residual>0.7, -2*2.0*residual, residual)
    return residual

def yaco_select(residual):
    condlist = [(residual>=0.0)&(residual<=0.5), (residual>=0.5)&(residual<=0.7), residual>0.7]
    choicelist = [-2*1.2*residual, -2*1.0*residual,-2*2.0*residual]
    residual = np.select(condlist, choicelist, default=residual)
    return residual


%timeit charl_3where(residual)
>>> 112 µs ± 1.7 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit yaco_select(residual)
>>> 141 µs ± 2 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
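%timeit is an IPython/Jupyter magic, so the lines above will not run in a plain Python script. A rough equivalent with the standard timeit module might look like the sketch below. The function name select_version and the repeat/number settings are arbitrary choices for illustration, and timings will differ from machine to machine.

```python
import timeit
import numpy as np

residual = np.random.rand(10000) - 0.3

def select_version(r):
    # Same logic as yaco_select above, repeated so this block runs on its own.
    condlist = [(r >= 0.0) & (r <= 0.5), (r >= 0.5) & (r <= 0.7), r > 0.7]
    choicelist = [-2 * 1.2 * r, -2 * 1.0 * r, -2 * 2.0 * r]
    return np.select(condlist, choicelist, default=r)

# 5 runs of 200 calls each; report the best run as time per call.
n_calls = 200
runs = timeit.repeat(lambda: select_version(residual), repeat=5, number=n_calls)
best_us = min(runs) / n_calls * 1e6
print(f"{best_us:.1f} µs per call (best of 5 runs)")
```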

Let's try to optimize this with numba:

from numba import jit

@jit(nopython=True)
def yaco_numba(residual):
    out = np.empty_like(residual)
    for i in range(residual.shape[0]):
        if residual[i]<0.0 :
            out[i] = residual[i]
        elif residual[i]<=0.5 :
            out[i] = -2*1.2*residual[i]
        elif residual[i]<=0.7:
            out[i] = -2*1.0*residual[i]
        else: # residual>0.7
            out[i] = -2*2.0*residual[i]
    return out

%timeit yaco_numba(residual)
>>> 6.65 µs ± 123 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Final check:

res1 = charl_3where(residual)
res2 = yaco_select(residual)
res3 = yaco_numba(residual)
np.allclose(res1,res3)
>>> True
np.allclose(res2,res3)
>>> True

The numba version is about 17x faster than the fastest NumPy version above (6.65 µs vs. 112 µs per call). Hope this helps.
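To tie this back to the question, here is a minimal sketch of the custom objective using np.select for both the gradient and the Hessian. It rests on an assumption the question does not state: residuals below zero keep the -2*residual gradient and 2.0 Hessian of the original function. The function name custom_piecewise_train and the weight array are hypothetical names introduced for illustration.

```python
import numpy as np

def custom_piecewise_train(y_true, y_pred):
    residual = (np.asarray(y_true) - np.asarray(y_pred)).astype("float")
    condlist = [
        (residual >= 0.0) & (residual <= 0.5),
        (residual >= 0.5) & (residual <= 0.7),
        residual > 0.7,
    ]
    # Per-element multiplier for each branch of the CASE WHEN.
    # ASSUMPTION: residual < 0 gets weight 1.0, as in the original function.
    weight = np.select(condlist, [1.2, 1.0, 2.0], default=1.0)
    grad = -2.0 * weight * residual
    hess = 2.0 * weight
    return grad, hess

grad, hess = custom_piecewise_train(np.array([-0.2, 0.3, 0.6, 0.9]), np.zeros(4))
print(grad)
print(hess)
```

Selecting the weight once and deriving both outputs from it keeps the gradient and the Hessian on the same branch for every element.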

107

answered Nov 14 '22 12:11

Yacola
