python numpy where returning unexpected warning

Question

Using python 2.7, scipy 1.0.0-3

Apparently I have a misunderstanding of how the numpy where function is supposed to operate or there is a known bug in its operation. I'm hoping someone can tell me which and explain a work-around to suppress the annoying warning that I am trying to avoid. I'm getting the same behavior when I use the pandas Series where().

To keep it simple, I'll use a numpy array as my example. Say I want to apply np.log() to the array, but only where a value is a valid input, i.e., myArray>0.0. Where the function should not be applied, I want to set the output to a flag value of -999.9:

import numpy as np
myArray = np.array([1.0, 0.75, 0.5, 0.25, 0.0])
np.where(myArray>0.0, np.log(myArray), -999.9)

I expected numpy.where() to not complain about the 0.0 value in the array since the condition is False there, yet it does and it appears to actually execute for that False condition:

-c:2: RuntimeWarning: divide by zero encountered in log 
array([  0.00000000e+00,  -2.87682072e-01,  -6.93147181e-01,
        -1.38629436e+00,  -9.99900000e+02])

The numpy documentation states:

If x and y are given and input arrays are 1-D, where is equivalent to: [xv if c else yv for (c,xv,yv) in zip(condition,x,y)]

I beg to differ with this statement since

[np.log(val) if val>0.0 else -999.9 for val in myArray]

provides no warning at all:

[0.0, -0.2876820724517809, -0.69314718055994529, -1.3862943611198906, -999.9]

So, is this a known bug? I don't want to suppress the warning for my entire code.

Paul Panzer · Accepted Answer

You can have the log evaluated only at the relevant places by using its optional where parameter:

np.where(myArray>0.0, np.log(myArray, where=myArray>0.0), -999.9)

or more efficiently

mask = myArray > 0.0
np.where(mask, np.log(myArray, where=mask), -999.9)

or if you find the "double where" ugly

np.log(myArray, where=myArray>0.0, out=np.full(myArray.shape, -999.9))

Any one of those three should suppress the warning.
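As a quick check, the three variants can be run with NumPy's floating-point error handling set to raise, so that any divide-by-zero inside log would abort the script rather than warn. This is only a sketch: the names flag, v1, v2 and v3 are illustrative, and the flag value -999.9 is used in all three. One caveat worth knowing: when where= is given without out=, the masked-out entries of the result are left uninitialized. That is harmless in the first two variants only because the outer np.where overwrites those positions.

```python
import numpy as np

myArray = np.array([1.0, 0.75, 0.5, 0.25, 0.0])
mask = myArray > 0.0
flag = -999.9

# divide="raise" turns the "divide by zero" condition into a
# FloatingPointError, so reaching the prints means log(0.0) never ran.
with np.errstate(divide="raise"):
    # "double where": log is skipped at masked-out positions
    v1 = np.where(myArray > 0.0, np.log(myArray, where=myArray > 0.0), flag)
    # same, but the mask is computed once
    v2 = np.where(mask, np.log(myArray, where=mask), flag)
    # out= is pre-filled with the flag; log only writes where mask is True
    v3 = np.log(myArray, where=mask, out=np.full(myArray.shape, flag))

print(v1)
print(np.array_equal(v1, v2) and np.array_equal(v2, v3))
```

The third form is the only one that never exposes uninitialized memory, since every position of out has a defined value before log runs.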

hpaulj · Answer

This behavior of where should be understandable given a basic understanding of Python. This is a Python expression that uses a couple of numpy functions.

What happens in this expression?

np.where(myArray>0.0, np.log(myArray), -999.9)

The interpreter first evaluates all the arguments of the function, and then passes the results to the where. Effectively then:

cond = myArray>0.0
A = np.log(myArray)
B = -999.9
np.where(cond, A, B)

The warning is produced in the 2nd line, not in the 4th.

The 4th line is equivalent to:

[xv if c else B for (c,xv) in zip(cond, A)]

or

[A[i] if c else B for i,c in enumerate(cond)]
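To see where the warning actually originates, the steps can be run separately while recording warnings. The sketch below uses the standard warnings module; the names caught_log and caught_where are illustrative, and np.errstate(divide="warn") is set explicitly on the assumption that the default error state may have been changed elsewhere.

```python
import warnings
import numpy as np

myArray = np.array([1.0, 0.75, 0.5, 0.25, 0.0])

# Step 1: evaluate the arguments, recording any warnings raised.
with warnings.catch_warnings(record=True) as caught_log:
    warnings.simplefilter("always")
    with np.errstate(divide="warn"):
        cond = myArray > 0.0
        A = np.log(myArray)   # log(0.0) is computed here, for every element
        B = -999.9

# Step 2: the selection itself, recorded separately.
with warnings.catch_warnings(record=True) as caught_where:
    warnings.simplefilter("always")
    result = np.where(cond, A, B)   # only picks between existing values

print("warnings from log:  ", [str(w.message) for w in caught_log])
print("warnings from where:", [str(w.message) for w in caught_where])
print("A[-1] =", A[-1], " result[-1] =", result[-1])
```

The last element of A is already -inf before np.where is called; where simply discards it in favor of B.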

np.where is most often used with one argument, where it is a synonym for np.nonzero. We don't see this three-argument form that often on SO. It isn't that useful, in part because it doesn't save on calculations: both value arguments are evaluated in full before where selects from them.

Masked assignment is more common, especially if there are more than 2 alternatives.

In [123]: mask = myArray>0
In [124]: out = np.full(myArray.shape, np.nan)
In [125]: out[mask] = np.log(myArray[mask])
In [126]: out
Out[126]: array([ 0.        , -0.28768207, -0.69314718, -1.38629436,         nan])

Paul Panzer showed how to do the same with the where parameter of log. That feature isn't being used as much as it could be.

In [127]: np.log(myArray, where=mask, out=out)
Out[127]: array([ 0.        , -0.28768207, -0.69314718, -1.38629436,         nan])
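Since the question asks for a way to avoid silencing warnings for the whole program, one further option, not mentioned in either answer, is NumPy's np.errstate context manager, which changes floating-point error handling only inside a with block. Note the difference from the where= approach: here log(0.0) is still computed (giving -inf) and then thrown away by np.where; only the warning is suppressed. A minimal sketch:

```python
import numpy as np

myArray = np.array([1.0, 0.75, 0.5, 0.25, 0.0])

# Ignore "divide by zero" only within this block; the previous
# error-handling settings are restored when the block exits.
with np.errstate(divide="ignore"):
    result = np.where(myArray > 0.0, np.log(myArray), -999.9)

print(result)
```

This keeps the original one-line expression intact, at the cost of still doing the wasted computation on the invalid entries.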

Tags: python, pandas, numpy

user1745564
