
Custom loss function in Keras/Tensorflow with if statement

I need to create a custom loss function in Keras that, depending on the result of a conditional, returns one of two different loss values. I am having trouble getting the if statement to run properly.

I need to do something similar to this:

import tensorflow as tf
from keras.losses import mean_squared_error

def custom_loss(y_true, y_pred):
    sess = tf.Session()
    const = 2
    if sess.run(tf.keras.backend.less(y_pred, y_true)):  # i.e. y_pred - y_true < 0
        return const * mean_squared_error(y_true, y_pred)
    else:
        return mean_squared_error(y_true, y_pred)

I keep getting tensor errors (see below) when trying to run this. Any help/advice will be appreciated!

InvalidArgumentError: You must feed a value for placeholder tensor 'dense_63_target' with dtype float and shape [?,?]
 [[Node: dense_63_target = Placeholder[dtype=DT_FLOAT, shape=[?,?], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
asked Apr 02 '26 by philly2013

1 Answer

You should instead simply multiply by a mask to get your desired function:

import keras.backend as K

def custom_loss1(y_true, y_pred):
    const = 2
    # 1.0 where the model under-predicts (y_pred < y_true), 0.0 elsewhere
    mask = K.cast(K.less(y_pred, y_true), K.floatx())
    # weight the squared error of under-predicted elements by `const`
    return K.mean((1.0 + (const - 1.0) * mask) * K.square(y_pred - y_true), axis=-1)

which has the same desired behavior: when y_pred is an under-prediction, the squared error for that element is scaled up by const. Note that K.less returns a boolean tensor, so the mask has to be cast to a float tensor (e.g. with K.cast(..., K.floatx())) before it can be multiplied with the error term.
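As a sanity check, here is a plain NumPy sketch of the same weighting (not the Keras graph code, just the arithmetic it performs), showing that an under-prediction of a given magnitude is penalized const times as hard as an over-prediction of the same magnitude:

```python
import numpy as np

def custom_loss1_np(y_true, y_pred, const=2.0):
    # 1.0 where the model under-predicts (y_pred < y_true), 0.0 elsewhere
    mask = (y_pred < y_true).astype(float)
    sq_err = (y_pred - y_true) ** 2
    # weight under-predicted elements by `const` inside the mean
    return np.mean((1.0 + (const - 1.0) * mask) * sq_err)

y_true = np.array([1.0, 1.0])
under = custom_loss1_np(y_true, np.array([0.5, 0.5]))  # under-prediction: 0.5
over = custom_loss1_np(y_true, np.array([1.5, 1.5]))   # over-prediction: 0.25
```

The same absolute error of 0.5 costs twice as much when the model under-predicts.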

Also, as unsolicited advice on your approach in general: I think you would get better results with a different loss.

import keras.backend as K
from keras.losses import mean_squared_error

def custom_loss2(y_true, y_pred):
    beta = 0.1
    # linear penalty term pushes the optimum toward slight over-prediction
    return mean_squared_error(y_true, y_pred) + beta * K.mean(y_true - y_pred, axis=-1)

Observe the difference in gradient behavior:

https://www.desmos.com/calculator/uubwgdhpi6

The second loss function shifts the location of the local minimum to a slight over-prediction rather than an under-prediction (which is what you want). The loss function you gave still locally optimizes toward zero mean error, just with gradients of different strength on each side. That will most likely just mean slower convergence to the same result as plain MSE, rather than a model that prefers to over-predict than to under-predict. I hope this makes sense.
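The shifted minimum can be checked numerically. A small NumPy sketch (again just the arithmetic, not the Keras code): for a target of 0 and beta = 0.1, the loss p^2 + beta*(0 - p) is minimized at p = beta/2 = 0.05, i.e. a slight over-prediction:

```python
import numpy as np

def custom_loss2_np(y_true, y_pred, beta=0.1):
    # MSE plus a linear term that rewards predicting a little high
    return np.mean((y_pred - y_true) ** 2) + beta * np.mean(y_true - y_pred)

y_true = np.zeros(1)
preds = np.linspace(-1.0, 1.0, 2001)  # grid with step 0.001
losses = [custom_loss2_np(y_true, np.array([p])) for p in preds]
best = preds[int(np.argmin(losses))]
# minimum sits at y_pred = beta / 2 = 0.05, slightly above the target
```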

answered Apr 03 '26 by modesitt


