To be precise, the loss function that I'm looking for is the squared error when the absolute error is less than 0.5, and the absolute error itself when it is greater than 0.5. This way, the gradient of the loss never exceeds 1 in magnitude: the gradient of the squared-error branch reaches 1 exactly when the absolute error hits 0.5, at which point the absolute-error branch takes over and the gradient stays constant at 1. I've included my current implementation below. For some reason, it's giving me worse performance than plain squared error.
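Written out (with e = y - y_), the loss my code below implements is

L(e) = e^2          if |e| < 0.5
L(e) = |e| - 0.25   otherwise

The -0.25 offset makes the two branches meet at |e| = 0.5 (since 0.5^2 = 0.25 = 0.5 - 0.25), so the loss is continuous there, and since d/de e^2 = 2e, the gradient magnitude grows to exactly 1 before the linear branch (slope ±1) takes over.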
# 1 when the error y - y_ is greater than -0.5, 0 otherwise
fn_choice_maker1 = (tf.to_int32(tf.sign(y - y_ + 0.5)) + 1)/2
# 1 when the error y - y_ is less than 0.5, 0 otherwise
fn_choice_maker2 = (tf.to_int32(tf.sign(y_ - y + 0.5)) + 1)/2
# 1 where |error| < 0.5 (quadratic region), 0 elsewhere
choice_maker_sqr = tf.to_float(tf.mul(fn_choice_maker1, fn_choice_maker2))
# squared-error term, active only inside the quadratic region
sqr_contrib = tf.mul(choice_maker_sqr, tf.square(y - y_))
# linear term |error| - 0.25, active only outside the quadratic region
abs_contrib = tf.abs(y - y_) - 0.25 - tf.mul(choice_maker_sqr, tf.abs(y - y_) - 0.25)
loss = tf.reduce_mean(sqr_contrib + abs_contrib)
train_step = tf.train.AdamOptimizer(1e-4).minimize(loss)
choice_maker_sqr is a column tensor that is 1 whenever the error is between -0.5 and 0.5, and 0 otherwise. The names are fairly self-explanatory.
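As a quick sanity check on the masks and the combined loss, here is a minimal sketch; the toy values for y and y_ are made up for illustration, and it uses the same pre-1.0 API as the code above (tf.mul is the old name for tf.multiply):

import tensorflow as tf

# toy predictions/targets giving errors of -0.2 (quadratic region),
# -0.7 and -1.3 (linear region)
y  = tf.constant([[0.0], [0.0], [0.0]])
y_ = tf.constant([[0.2], [0.7], [1.3]])

fn_choice_maker1 = (tf.to_int32(tf.sign(y - y_ + 0.5)) + 1)/2
fn_choice_maker2 = (tf.to_int32(tf.sign(y_ - y + 0.5)) + 1)/2
choice_maker_sqr = tf.to_float(tf.mul(fn_choice_maker1, fn_choice_maker2))
sqr_contrib = tf.mul(choice_maker_sqr, tf.square(y - y_))
abs_contrib = tf.abs(y - y_) - 0.25 - tf.mul(choice_maker_sqr, tf.abs(y - y_) - 0.25)

with tf.Session() as sess:
    print(sess.run(choice_maker_sqr))           # [[1.], [0.], [0.]]
    print(sess.run(sqr_contrib + abs_contrib))  # [[0.04], [0.45], [1.05]]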
Here is my implementation of the Huber loss function in TensorFlow (Python):
import tensorflow as tf

def huber_loss(y_true, y_pred, max_grad=1.):
    """Calculates the Huber loss.

    Parameters
    ----------
    y_true: np.array, tf.Tensor
        Target value.
    y_pred: np.array, tf.Tensor
        Predicted value.
    max_grad: float, optional
        Positive floating point value. Represents the maximum possible
        gradient magnitude.

    Returns
    -------
    tf.Tensor
        The Huber loss, elementwise (reduce it yourself, e.g. with
        tf.reduce_mean, before minimizing).
    """
    err = tf.abs(y_true - y_pred, name='abs')
    mg = tf.constant(max_grad, name='max_grad')
    lin = mg * (err - 0.5 * mg)
    quad = 0.5 * err * err
    # quadratic for |err| < max_grad, linear (slope max_grad) beyond
    return tf.where(err < mg, quad, lin)
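A minimal usage sketch (assuming, as in the question, that y is the prediction tensor and y_ the target):

loss = tf.reduce_mean(huber_loss(y_, y, max_grad=1.))
train_step = tf.train.AdamOptimizer(1e-4).minimize(loss)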
You can use tf.select to implement it in a single call:
err = y - y_
huber_loss = tf.select(tf.abs(err) < 1.0,
                       0.5 * tf.square(err),
                       tf.abs(err) - 0.5)  # if, then, else
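Note that tf.select was renamed tf.where in TensorFlow 1.0. On recent versions you don't need to hand-roll this at all; a sketch using the built-ins (again assuming y_ holds the targets and y the predictions):

# TF 1.x: built-in Huber loss; delta plays the role of max_grad
loss = tf.losses.huber_loss(labels=y_, predictions=y, delta=1.0)

# TF 2.x: Keras equivalent
huber = tf.keras.losses.Huber(delta=1.0)
loss = huber(y_, y)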