I am new to Keras. I need some help writing a custom loss function in Keras (TensorFlow backend) for the following loss equation.
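Written out from the definitions and the code attempt in this question, the loss for one sample would be roughly the following. The 1/N normalisation mirrors the `loss/num_joints` in that code and is an assumption about the intended formula; (x_n, y_n) stands for y_truen.

```latex
\[
L = \frac{1}{N} \sum_{n=1}^{N} \sum_{i=0}^{255} \sum_{j=0}^{255}
    \bigl( M_n(i, j) - \tilde{M}_n(i, j) \bigr)^2,
\qquad
\tilde{M}_n(i, j) = \exp\!\left( -\frac{(i - x_n)^2 + (j - y_n)^2}{2\,\mathrm{std}^2} \right)
\]
```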
The parameters passed to the loss function are:
y_true has shape (batch_size, N, 2). Here, we are passing N (x, y) coordinates in each sample in the batch.
y_pred has shape (batch_size, 256, 256, N). Here, we are passing N predicted heatmaps of 256 x 256 pixels in each sample in the batch.
i ∈ [0, 255], j ∈ [0, 255]
Mn(i, j) represents the value at pixel location (i, j) for the nth predicted heatmap.
Mn∼(i, j) = Gaussian2D((i, j), y_truen, std)
where
std = standard deviation, the same for both dimensions (5 px).
y_truen is the nth (x, y) coordinate. This is the mean.
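To make the target heatmap concrete, here is a minimal NumPy sketch for a single keypoint. The helper name `gaussian_heatmap` is made up for illustration. It assumes an unnormalised Gaussian (peak value 1) and, like the code attempt in this question, matches the first coordinate of the keypoint against the first array axis.

```python
import numpy as np

def gaussian_heatmap(center, height=256, width=256, std=5.0):
    """Unnormalised 2-D Gaussian with peak value 1 at `center`.

    `center[0]` is compared with the first array axis and `center[1]`
    with the second (an assumption that follows the question's code).
    """
    i = np.arange(height, dtype=np.float64)[:, None]  # shape (height, 1)
    j = np.arange(width, dtype=np.float64)[None, :]   # shape (1, width)
    sq_dist = (i - center[0]) ** 2 + (j - center[1]) ** 2
    return np.exp(-0.5 * sq_dist / std ** 2)

heatmap = gaussian_heatmap((100.0, 40.0))
print(heatmap.shape)     # (256, 256)
print(heatmap[100, 40])  # 1.0
```

One such heatmap per keypoint, stacked along the last axis, gives a target with the same shape as one sample of y_pred.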
For details, please see the L2 loss described in this paper: Human Pose Estimation.
Note: I included batch_size in the shapes of y_true and y_pred because I assumed that Keras calls the loss function on the entire batch and not on individual samples in the batch. Correct me if I am wrong.
def l2_loss(y_true, y_pred):
    loss = 0
    n = y_true.shape[0]
    for j in range(n):
        for i in range(num_joints):
            yv, xv = tf.meshgrid(tf.arange(0, im_height), tf.arange(0, im_width))
            z = np.array([xv, yv]).transpose(1, 2, 0)
            ground = np.exp(-0.5*(((z - y_true[j, i, :])**2).sum(axis=2))/(sigma**2))
            loss = loss + np.sum((ground - y_pred[j,:, :, i])**2)
    return loss/num_joints
This is the code I have written so far. I know that it won't run, since we can't use NumPy ndarrays directly inside a Keras loss function. Also, I need to eliminate the loops!
We can create a custom loss function in Keras by writing a function that takes two arguments, the true values and the predicted values, and returns the loss. We then pass the custom loss function to model.compile as a parameter, just as we would with any other loss function.
Creating custom loss functions in Keras
A custom loss function can be created by defining a function that takes the true values and predicted values as required parameters. The function should return an array of losses (one per sample). The function can then be passed at the compile stage.
The tensor y_true is the true data (or target, ground truth) you pass to the fit method. It's a conversion of the numpy array y_train into a tensor. The tensor y_pred is the data predicted (calculated, output) by your model.
The loss is used to calculate the gradients for the neural net, and the gradients are used to update the weights. This is how a neural net is trained. Keras has many built-in loss functions, which I have covered in one of my previous blog posts.
You can pretty much just translate the NumPy functions into Keras backend functions. The only thing to watch out for is setting up the right broadcast shapes.
from keras import backend as K

def l2_loss_keras(y_true, y_pred):
    # set up meshgrid: (height, width, 2)
    meshgrid = K.tf.meshgrid(K.arange(im_height), K.arange(im_width))
    meshgrid = K.cast(K.transpose(K.stack(meshgrid)), K.floatx())

    # set up broadcast shape: (batch_size, height, width, num_joints, 2)
    meshgrid_broadcast = K.expand_dims(K.expand_dims(meshgrid, 0), -2)
    y_true_broadcast = K.expand_dims(K.expand_dims(y_true, 1), 2)
    diff = meshgrid_broadcast - y_true_broadcast

    # compute loss: first sum over (height, width), then take average over num_joints
    ground = K.exp(-0.5 * K.sum(K.square(diff), axis=-1) / sigma ** 2)
    loss = K.sum(K.square(ground - y_pred), axis=[1, 2])
    return K.mean(loss, axis=-1)
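The broadcast shapes used in the function above can be checked in plain NumPy without building a model. This is only a sketch of the same shape arithmetic on small, made-up sizes, not the Keras code itself; `indexing="ij"` is used here so that `grid[h, w] == (h, w)` directly.

```python
import numpy as np

batch_size, height, width, num_joints, sigma = 2, 8, 6, 3, 2.0
rng = np.random.default_rng(0)
y_true = rng.uniform(0, 5, size=(batch_size, num_joints, 2))
y_pred = rng.uniform(0, 1, size=(batch_size, height, width, num_joints))

# Pixel grid with grid[h, w] == (h, w); shape (height, width, 2).
grid = np.stack(
    np.meshgrid(np.arange(height), np.arange(width), indexing="ij"), axis=-1
)

grid_b = grid[None, :, :, None, :]      # (1, height, width, 1, 2)
y_true_b = y_true[:, None, None, :, :]  # (batch_size, 1, 1, num_joints, 2)
diff = grid_b - y_true_b                # (batch_size, height, width, num_joints, 2)

# Gaussian targets, then sum over pixels and average over joints.
ground = np.exp(-0.5 * np.sum(diff ** 2, axis=-1) / sigma ** 2)
per_sample = np.sum((ground - y_pred) ** 2, axis=(1, 2)).mean(axis=-1)

print(diff.shape, per_sample.shape)  # (2, 8, 6, 3, 2) (2,)
```

The size-1 axes are what let the grid (shared by every sample and joint) and the keypoints (shared by every pixel) combine without any Python loops.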
To verify it:
import numpy as np

def l2_loss_numpy(y_true, y_pred):
    loss = 0
    n = y_true.shape[0]
    for j in range(n):
        for i in range(num_joints):
            yv, xv = np.meshgrid(np.arange(0, im_height), np.arange(0, im_width))
            z = np.stack([xv, yv]).transpose(1, 2, 0)
            ground = np.exp(-0.5*(((z - y_true[j, i, :])**2).sum(axis=2))/(sigma**2))
            loss = loss + np.sum((ground - y_pred[j,:, :, i])**2)
    return loss/num_joints

batch_size = 32
num_joints = 10
sigma = 5
im_width = 256
im_height = 256
y_true = 255 * np.random.rand(batch_size, num_joints, 2)
y_pred = 255 * np.random.rand(batch_size, im_height, im_width, num_joints)
print(l2_loss_numpy(y_true, y_pred))
45448272129.0
print(K.eval(l2_loss_keras(K.variable(y_true), K.variable(y_pred))).sum())
4.5448e+10
The number is truncated under the default dtype float32. If you run it with dtype set to float64:
y_true = 255 * np.random.rand(batch_size, num_joints, 2)
y_pred = 255 * np.random.rand(batch_size, im_height, im_width, num_joints)
print(l2_loss_numpy(y_true, y_pred))
45460126940.6
print(K.eval(l2_loss_keras(K.variable(y_true), K.variable(y_pred))).sum())
45460126940.6
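To see why float32 cannot reproduce the first total digit for digit, it helps to look at how coarse float32 is at this magnitude. The snippet below is a NumPy-only sketch and does not touch Keras. The answer does not show how the dtype was switched; calling `K.set_floatx('float64')` before building the tensors is one way, and that is an assumption about what was done here.

```python
import numpy as np

value = 45448272129.0  # total printed by the NumPy version in the first run
as_f32 = np.float32(value)

# Gap between neighbouring float32 values near 4.5e10.
print(np.spacing(as_f32))  # 4096.0

# Rounding error introduced just by storing the total as float32.
print(float(as_f32) - value)
```

With a gap of 4096 between representable values, agreement to about seven significant digits is the best float32 can offer for a total of this size.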
EDIT:
It seems that Keras requires y_true and y_pred to have the same number of dimensions. For example, on the following testing model:
X = np.random.rand(batch_size, 256, 256, 3)
model = Sequential([Dense(10, input_shape=(256, 256, 3))])
model.compile(loss=l2_loss_keras, optimizer='adam')
model.fit(X, y_true, batch_size=8)
ValueError: Cannot feed value of shape (8, 10, 2) for Tensor 'dense_2_target:0', which has shape '(?, ?, ?, ?)'
To deal with this problem, you can add a dummy dimension with expand_dims before feeding y_true into the model:
def l2_loss_keras(y_true, y_pred):
    ...
    y_true_broadcast = K.expand_dims(y_true, 1)  # change this line
    ...

model.fit(X, np.expand_dims(y_true, axis=1), batch_size=8)
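The shape bookkeeping behind this change can be traced in NumPy. This sketch only follows the shapes, using the batch size of 8 and the 10 joints from the test model above, and uses `np.expand_dims` as a stand-in for `K.expand_dims`.

```python
import numpy as np

batch_size, num_joints = 8, 10
y_true = 255 * np.random.rand(batch_size, num_joints, 2)

# What is passed to model.fit: rank 4, the same rank as y_pred.
y_fit = np.expand_dims(y_true, axis=1)
print(y_fit.shape)  # (8, 1, 10, 2)

# Inside the modified loss: one expand_dims instead of two.
y_true_broadcast = np.expand_dims(y_fit, axis=1)
print(y_true_broadcast.shape)  # (8, 1, 1, 10, 2)
```

The final shape is the same (batch_size, 1, 1, num_joints, 2) that the original loss built with two expand_dims calls, so the rest of the function is unchanged.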