What would be a good loss function to penalize both magnitude and sign difference?

I need to train a model to predict a scalar value, and it's important that the predicted value has the same sign as the true value while the squared error is kept to a minimum.

What would be a good choice of loss function for that?

For example:

Let's say the predicted value is -1 and the true value is 1. The loss between the two should be a lot greater than the loss between 3 and 1, even though the squared error of (3, 1) and (-1, 1) is equal.

Thanks a lot!

1 Answer

This turned out to be a really interesting question - thanks for asking it! First, remember that you want your loss function to be built entirely from differentiable operations, so that you can backpropagate through it. This means that any old arbitrary logic won't necessarily do.

To restate your problem: you want to find a differentiable function of two variables that increases sharply when the two variables take on values of different signs, and more slowly when they share the same sign. Additionally, you want some control over how sharply these values increase, relative to one another. Thus, we want something with two configurable constants.

I started constructing a function that met these needs, but then remembered one you can find in any high school geometry textbook: the elliptic paraboloid!
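On the differentiability point, here is a minimal sketch of why a hard sign test gives gradient descent nothing to work with. The names `hard_sign_penalty`, `squared_error` and `central_difference` are hypothetical helpers made up for this illustration, and the gradient is estimated numerically rather than by an autodiff library.

```python
import numpy as np

def hard_sign_penalty(pred, true):
    # Hypothetical step-like penalty: 1.0 if the signs differ, else 0.0.
    return float(np.sign(pred) != np.sign(true))

def squared_error(pred, true):
    # Smooth in pred, so it has a useful gradient everywhere.
    return (pred - true) ** 2

def central_difference(f, pred, true, eps=1e-4):
    # Numerical estimate of the derivative of f with respect to pred.
    return (f(pred + eps, true) - f(pred - eps, true)) / (2 * eps)

# Predicted -1, true 1 (the example from the question).
grad_hard = central_difference(hard_sign_penalty, -1.0, 1.0)
grad_smooth = central_difference(squared_error, -1.0, 1.0)

print(grad_hard)    # flat: the step function gives no direction to move in
print(grad_smooth)  # close to -4, i.e. 2 * (pred - true)
```

The step penalty is constant on either side of zero, so its gradient with respect to the prediction is zero almost everywhere, which is why a smooth surface is needed instead.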

[Figure: a rotated elliptic paraboloid.]

The standard form has its axes aligned with x and y, so it treats the sign-agreement and sign-disagreement directions alike; I had to rotate it by 45 degrees so that its axes line up with x = y and x = -y. The plot above is the result. Note that it rises more sharply along the direction where the signs don't agree and less sharply where they do, and that the constants controlling this behaviour are configurable. The code below is all that was needed to define and plot the loss function. I don't think I've ever used a geometric form as a loss function before - really neat.
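Written out, and assuming the rotation angle stays fixed at t = π/4 as in the code, the surface being computed is the following (the symbols follow the code's variable names):

```latex
\begin{aligned}
x_{\mathrm{rot}} &= x\cos t + y\sin t = \frac{x + y}{\sqrt{2}}, \\
y_{\mathrm{rot}} &= -x\sin t + y\cos t = \frac{y - x}{\sqrt{2}}, \\
z &= \frac{x_{\mathrm{rot}}^{2}}{c_{\mathrm{diff\_sign}}} + \frac{y_{\mathrm{rot}}^{2}}{c_{\mathrm{same\_sign}}}
   = \frac{(x + y)^{2}}{2\,c_{\mathrm{diff\_sign}}} + \frac{(y - x)^{2}}{2\,c_{\mathrm{same\_sign}}}.
\end{aligned}
```

The first term is largest along x = y and vanishes along x = -y; the second does the opposite. So in this formula c_same_sign sets how fast the surface rises where the signs differ, and a smaller constant means a steeper rise.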

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm

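def elliptic_paraboloid_loss(x, y, c_diff_sign, c_same_sign):
    # Compute a rotated elliptic paraboloid.
    # Rotate the (x, y) axes by 45 degrees.
    t = np.pi / 4
    # x_rot lies along the line x = y (signs agree);
    # y_rot lies along the line x = -y (signs differ).
    x_rot = (x * np.cos(t)) + (y * np.sin(t))
    y_rot = (x * -np.sin(t)) + (y * np.cos(t))
    # A smaller divisor makes the surface rise faster along that axis.
    z = ((x_rot**2) / c_diff_sign) + ((y_rot**2) / c_same_sign)
    return z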

# Divisors for the two terms of the paraboloid.
c_diff_sign = 4
c_same_sign = 2

# Evaluate the loss on a grid of (a, b) pairs.
a = np.arange(-5, 5, 0.1)
b = np.arange(-5, 5, 0.1)
loss_map = np.zeros((len(a), len(b)))
for i, a_i in enumerate(a):
    for j, b_j in enumerate(b):
        loss_map[i, j] = elliptic_paraboloid_loss(a_i, b_j, c_diff_sign, c_same_sign)

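fig = plt.figure()
# Create 3D axes for the surface plot.
ax = fig.add_subplot(projection='3d')
# indexing='ij' makes X[i, j] = a[i] and Y[i, j] = b[j], matching loss_map[i, j].
X, Y = np.meshgrid(a, b, indexing='ij')
surf = ax.plot_surface(X, Y, loss_map, cmap=cm.coolwarm,
                       linewidth=0, antialiased=False)
plt.show()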
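As a quick numerical check of the surface, here is a sketch that evaluates the same formula on the whole grid at once through NumPy broadcasting and then samples a few points. The name `rotated_paraboloid_loss`, its default arguments (taken from the constants above) and the sample points are choices made for this illustration.

```python
import numpy as np

def rotated_paraboloid_loss(x, y, c_diff_sign=4.0, c_same_sign=2.0):
    # Same formula as elliptic_paraboloid_loss above; redefined so this
    # block runs on its own. Works on scalars and on NumPy arrays.
    t = np.pi / 4
    x_rot = x * np.cos(t) + y * np.sin(t)
    y_rot = -x * np.sin(t) + y * np.cos(t)
    return x_rot**2 / c_diff_sign + y_rot**2 / c_same_sign

# Vectorized grid evaluation: A[i, j] = a[i], B[i, j] = b[j].
a = np.arange(-5, 5, 0.1)
b = np.arange(-5, 5, 0.1)
A, B = np.meshgrid(a, b, indexing="ij")
loss_map_vec = rotated_paraboloid_loss(A, B)

# Same distance from the origin, along the two rotated axes.
same_sign = rotated_paraboloid_loss(2.0, 2.0)    # about 2.0
diff_sign = rotated_paraboloid_loss(2.0, -2.0)   # about 4.0

# The (predicted, true) pairs from the question.
question_wrong_sign = rotated_paraboloid_loss(-1.0, 1.0)  # about 1.0
question_same_sign = rotated_paraboloid_loss(3.0, 1.0)    # about 3.0

print(same_sign, diff_sign, question_wrong_sign, question_same_sign)
```

Measured from the origin, the surface is twice as steep along x = -y as along x = y with these constants. For the pairs from the question it gives about 1.0 at (-1, 1) and about 3.0 at (3, 1), because the surface is centered on the origin rather than on the line where the prediction equals the target. Depending on the use case, it may need to be combined with a term in the difference between the two values.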

