I would like to perform an operation similar to np.where on Keras tensors with the TensorFlow backend. Say I have two tensors, diff and sum, and I divide them elementwise:
rel_dev = diff / sum
For NumPy arrays I would write:
rel_dev = np.where((diff == 0.0) & (sum == 0.0), 0.0, rel_dev)
rel_dev = np.where((diff != 0.0) & (sum == 0.0), np.sign(diff), rel_dev)
That is, if an element is zero in both diff and sum, I want rel_dev to be zero there instead of np.inf or nan, and if only sum is zero I want the sign of diff. In Keras with tensors this did not work for me. I have tried K.switch, K.set_value, etc. As I understand it, these operate on the whole tensor, not on its individual elements, right? The code does run without these conditions, but I have no idea what actually happens in those cases, and I have not managed to debug it yet.
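Here is a complete NumPy version of what I mean (the values are made up just to show the three cases; the errstate block only silences the divide-by-zero warnings):
import numpy as np
diff = np.array([0.0, 1.0, 2.0, -2.0, 3.0, 0.0])
sum = np.array([2.0, 4.0, 1.0, 0.0, 5.0, 0.0])
with np.errstate(divide='ignore', invalid='ignore'):
    rel_dev = diff / sum  # inf where only sum == 0, nan where both are 0
rel_dev = np.where((diff == 0.0) & (sum == 0.0), 0.0, rel_dev)
rel_dev = np.where((diff != 0.0) & (sum == 0.0), np.sign(diff), rel_dev)
print(rel_dev)
# [ 0.    0.25  2.   -1.    0.6   0.  ]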
Could you please tell me how to write both conditions for rel_dev in Keras?
You can do that in Keras like this:
import keras.backend as K
diff = K.constant([0, 1, 2, -2, 3, 0])
sum = K.constant([2, 4, 1, 0, 5, 0])
rel_dev = diff / sum
d0 = K.equal(diff, 0)
s0 = K.equal(sum, 0)
rel_dev = K.switch(d0 & s0, K.zeros_like(rel_dev), rel_dev)
rel_dev = K.switch(~d0 & s0, K.sign(diff), rel_dev)
print(K.eval(rel_dev))
# [ 0. 0.25 2. -1. 0.6 0. ]
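Since you asked about an analogue of np.where: with the TensorFlow backend, tf.where is that analogue, and the two conditions can equivalently be written with it (a sketch, assuming the TensorFlow backend is active; as far as I know K.switch dispatches to tf.where for elementwise boolean conditions, so the results match):
import tensorflow as tf
rel_dev = diff / sum
rel_dev = tf.where(d0 & s0, K.zeros_like(rel_dev), rel_dev)
rel_dev = tf.where(~d0 & s0, K.sign(diff), rel_dev)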
EDIT: The above formulation has an insidious problem, which is that, even though the result is right, nan values will propagate back through the gradients (namely because dividing by zero gives inf or nan, and multiplying inf or nan by zero gives nan). Indeed, if you check the gradients:
gd, gs = K.gradients(rel_dev, (diff, sum))
print(K.eval(gd))
# [0.5 0.25 1. nan 0.2 nan]
print(K.eval(gs))
# [-0. -0.0625 -2. nan -0.12 nan]
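The nan comes from the chain rule: for the elements where sum is zero, the local gradient of diff / sum is inf or nan, and the zero that the switch contributes for the discarded branch gets multiplied by it; 0 * inf is nan in floating point. A tiny NumPy illustration of that last step (the variable names are just for the explanation):
import numpy as np
with np.errstate(divide='ignore', invalid='ignore'):
    diff, sum_ = np.float32(-2), np.float32(0)
    local_grad = -diff / sum_ ** 2   # d(diff/sum)/d(sum) = -diff/sum**2 -> inf at sum == 0
    upstream = np.float32(0)         # zero contributed by the switch for the discarded branch
    print(upstream * local_grad)     # nan, which is what shows up in K.eval(gd) and K.eval(gs) above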
The trick you can use to avoid that is to change sum in the division in a way that does not affect the result but prevents the nan values, for example like this:
import keras.backend as K
diff = K.constant([0, 1, 2, -2, 3, 0])
sum = K.constant([2, 4, 1, 0, 5, 0])
d0 = K.equal(diff, 0)
s0 = K.equal(sum, 0)
# sum zeros are replaced by ones on division
rel_dev = diff / K.switch(s0, K.ones_like(sum), sum)
rel_dev = K.switch(d0 & s0, K.zeros_like(rel_dev), rel_dev)
rel_dev = K.switch(~d0 & s0, K.sign(diff), rel_dev)
print(K.eval(rel_dev))
# [ 0. 0.25 2. -1. 0.6 0. ]
gd, gs = K.gradients(rel_dev, (diff, sum))
print(K.eval(gd))
# [0.5 0.25 1. 0. 0.2 0. ]
print(K.eval(gs))
# [-0. -0.0625 -2. 0. -0.12 0. ]
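If you need this inside a model, for example in a custom loss or metric, you can wrap the safe version in a function of two tensors (a sketch; the function names and the loss below are only illustrative):
import keras.backend as K
def safe_relative_deviation(diff, total):
    # total plays the role of sum above; renamed only to avoid shadowing the builtin
    d0 = K.equal(diff, 0)
    s0 = K.equal(total, 0)
    # replace zeros in the denominator so no inf/nan is created in the first place
    rel_dev = diff / K.switch(s0, K.ones_like(total), total)
    rel_dev = K.switch(d0 & s0, K.zeros_like(rel_dev), rel_dev)
    rel_dev = K.switch(~d0 & s0, K.sign(diff), rel_dev)
    return rel_dev

def rel_dev_loss(y_true, y_pred):
    # e.g. mean absolute relative deviation between predictions and targets
    return K.mean(K.abs(safe_relative_deviation(y_pred - y_true, y_pred + y_true)))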