 

keras loss function for 360 degree prediction

I'm trying to predict azimuths using keras/tensorflow. y_true ranges from 0-359, but I need a loss function that handles predictions that have wrapped around and are outside that range. Unfortunately, when I try any kind of modular division (tf.mod() or %), I get an error...

LookupError: No gradient defined for operation 'FloorMod' (op type: FloorMod)

so I think I've worked around this with the following...

def mean_squared_error_360(y_true, y_pred):
  delta = K.minimum(K.minimum(K.abs(y_pred - y_true),
                              K.abs(y_pred - (360 + y_true))),
                    K.abs(y_true - (360 + y_pred)))
  return K.mean(K.square(delta), axis=-1)

def rmse_360(y_true, y_pred):
  return K.sqrt(mean_squared_error_360(y_true, y_pred))


model.compile(loss=mean_squared_error_360,
              optimizer=rmsprop(lr=0.0001),
              metrics=[rmse_360])

this handles the following edge cases... I haven't come across predictions < 0, so I'm not addressing them.

y =   1  y_pred = 361  err = 0
y = 359  y_pred =   1  err = 2
y = 359  y_pred = 361  err = 2
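Those edge cases can be sanity-checked outside the graph; `delta_360` below is just a numpy transliteration of the min-of-three-candidates delta in the loss above (the helper name is mine, for illustration only):

```python
import numpy as np

def delta_360(y_true, y_pred):
    # same min-of-three-candidates trick as the keras loss above,
    # rewritten with numpy for a quick sanity check
    return np.minimum(np.minimum(np.abs(y_pred - y_true),
                                 np.abs(y_pred - (360 + y_true))),
                      np.abs(y_true - (360 + y_pred)))

print(delta_360(1, 361))    # 0
print(delta_360(359, 1))    # 2
print(delta_360(359, 361))  # 2
```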

Questions

  • this feels clunky; is there a smarter solution?
  • intuitively, I think there's no difference in outcome between using mean_squared_error and root_mean_squared_error as the loss... the gradients will be different, but the same optimum weights will solve both, right? Is there any reason to pick one over the other? I'd guess mse is slightly cheaper to compute than rmse, but that should be trivial. I've tried both, and rmse 'feels' like a more orderly descent than mse... is there something about the magnitude of the squared errors that makes mse jump around more?
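On the second bullet: any monotone transform of the loss (sqrt included) preserves the minimizer, so the same weights solve both; the descent behaves differently, though. A one-dimensional sketch (the `grads` helper is hypothetical, collapsing the network to a single scalar error):

```python
import numpy as np

# mse and rmse share the same minimizer because sqrt is monotone, but their
# gradients differ by the chain-rule factor 1 / (2 * rmse). For a single
# scalar error, d(mse)/d(err) shrinks as the error shrinks, while
# d(rmse)/d(err) = sign(err) keeps a constant magnitude.
def grads(err):
    mse = err ** 2
    d_mse = 2 * err                      # gradient of mse w.r.t. the error
    d_rmse = d_mse / (2 * np.sqrt(mse))  # chain rule: equals sign(err)
    return d_mse, d_rmse

for e in (10.0, 1.0, 0.01):
    print(e, grads(e))
```

The 1/(2·rmse) factor grows without bound as the error approaches zero, so rmse updates don't taper off near the optimum; that is one plausible reason an rmse loss can overshoot or diverge late in training.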

Thanks in advance.

EDIT

For whatever reason... my original mse seemed to be fitting the training set, but the validation loss was noisy from epoch to epoch, without any real improvement after a few epochs. rmse seemed like a more orderly descent... until the loss went to inf after improving for a couple dozen epochs. I might have bigger issues than the loss function.

EDIT 2 - adding my implementation of @Patwie's answer below

ah... trig!! of course!! Unfortunately, I'm using tf v1.0, which doesn't seem to have tf.atan2(). Strangely, I couldn't find an atan2 implementation in the tf repository, but I think asos-ben's suggestion in issue 6095 does the trick. See here: https://github.com/tensorflow/tensorflow/issues/6095

def atan2(y, x, epsilon=1.0e-12):
  # standard argument order: atan2(y, x), matching np.arctan2 / tf.atan2
  # (the snippet from issue 6095 declares the denominator first, which
  #  silently swaps the arguments when called like the real atan2)
  # nudge exact zeros so y/x never divides by zero
  x = tf.where(tf.equal(x, 0.0), x + epsilon, x)
  y = tf.where(tf.equal(y, 0.0), y + epsilon, y)
  angle = tf.where(tf.greater(x, 0.0), tf.atan(y / x), tf.zeros_like(x))
  angle = tf.where(tf.logical_and(tf.less(x, 0.0),  tf.greater_equal(y, 0.0)), tf.atan(y / x) + np.pi, angle)
  angle = tf.where(tf.logical_and(tf.less(x, 0.0),  tf.less(y, 0.0)), tf.atan(y / x) - np.pi, angle)
  angle = tf.where(tf.logical_and(tf.equal(x, 0.0), tf.greater(y, 0.0)), 0.5 * np.pi * tf.ones_like(x), angle)
  angle = tf.where(tf.logical_and(tf.equal(x, 0.0), tf.less(y, 0.0)), -0.5 * np.pi * tf.ones_like(x), angle)
  angle = tf.where(tf.logical_and(tf.equal(x, 0.0), tf.equal(y, 0.0)), tf.zeros_like(x), angle)
  return angle

# y_true, y_pred in radians
# note: despite the name, this is a mean absolute angular error, not an rmse
def rmse_360_2(y_true, y_pred):
  return K.mean(K.abs(atan2(K.sin(y_true - y_pred), K.cos(y_true - y_pred))))

Only about 7 epochs in on a test run, but it seems promising.
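The wrapped loss can also be checked out-of-graph; `angular_error` below is a numpy transliteration of the loss (helper name mine), with the original degree-space edge cases converted to radians:

```python
import numpy as np

# numpy version of the wrapped angular error (angles in radians)
def angular_error(y_true, y_pred):
    d = y_true - y_pred
    return np.abs(np.arctan2(np.sin(d), np.cos(d)))

# the same edge cases as before, now in radians
print(np.degrees(angular_error(np.radians(1),   np.radians(361))))  # ~0
print(np.degrees(angular_error(np.radians(359), np.radians(1))))    # ~2
print(np.degrees(angular_error(np.radians(359), np.radians(361))))  # ~2
```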

kmh asked Sep 22 '17

1 Answer

Converting my comment into an answer. Given two angles a (gt), b (prediction) as radians you get the angle difference by

tf.atan2(tf.sin(a - b), tf.cos(a - b))

By definition, tf.atan2 returns the difference in the closed interval [-pi, +pi] (that is, [-180 degrees, +180 degrees]).

Hence, you can use

tf.reduce_mean(tf.abs(tf.atan2(tf.sin(a - b), tf.cos(a - b))))

I think Keras understands this TensorFlow code.
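A small numpy check of the same formula, with np.arctan2 standing in for tf.atan2 (the sample angles are arbitrary):

```python
import numpy as np

# the formula maps any raw angle difference back into [-pi, pi]
a = np.radians(np.array([10.0, 350.0, 5.0]))    # ground truth
b = np.radians(np.array([350.0, 10.0, 184.0]))  # predictions
diff = np.arctan2(np.sin(a - b), np.cos(a - b))
print(np.degrees(diff))  # approximately [20., -20., -179.]
```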

Patwie answered Oct 14 '22