I have a generator function that infinitely cycles over some directories of images and outputs 3-tuples of batches of the form

    [img1, img2], label, weight

where img1 and img2 are batch_size x M x N x 3 tensors, and label and weight are each batch_size x 1 tensors. I provide this generator to the fit_generator function when training a model with Keras.
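For concreteness, a minimal sketch of the generator's shape (the name image_pair_generator and the zero/one fill values are placeholders; the real directory handling and weight logic are elided):

    import numpy as np

    def image_pair_generator(batch_size, M, N):
        # Placeholder sketch: the real generator cycles over image
        # directories and derives each weight from properties of
        # img1, img2, and label.
        while True:
            img1 = np.zeros((batch_size, M, N, 3), dtype=np.float32)
            img2 = np.zeros((batch_size, M, N, 3), dtype=np.float32)
            label = np.zeros((batch_size, 1), dtype=np.float32)
            weight = np.ones((batch_size, 1), dtype=np.float32)
            yield [img1, img2], label, weight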
For this model I have a custom cosine contrastive loss function:

    import keras.backend as K

    def cosine_contrastive_loss(y_true, y_pred):
        cosine_distance = 1 - y_pred  # note: currently unused below
        margin = 0.9
        # y_pred is the cosine similarity of the two embeddings
        cdist = y_true * y_pred + (1 - y_true) * K.maximum(margin - y_pred, 0.0)
        return K.mean(cdist)
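For reference, the loss is attached and trained in the usual way (a minimal sketch; the optimizer and step counts are placeholders). Keras already treats the third element of the yielded tuple as per-sample weights for its built-in weighting, though that is separate from accessing them inside the loss:

    model.compile(optimizer="adam", loss=cosine_contrastive_loss)
    model.fit_generator(image_pair_generator(batch_size=32, M=64, N=64),
                        steps_per_epoch=100, epochs=10)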
Structurally everything runs fine with my model: there are no errors, and it consumes the inputs and labels from the generator as expected.

But now I want to use the weight parameter of each batch directly and perform some customized logic inside cosine_contrastive_loss based on the sample-specific weights. How can I access this part of a batch's structure at the moment the loss function is executed?

Note that since the generator cycles infinitely, it is not possible to precompute the weights, or to compute them on the fly, in order to curry them into the loss function. They have to be generated in unison with the samples; indeed, custom logic in my data generator determines the weights dynamically from properties of img1, img2, and label at the moment a batch is generated.
The only thing I can think of is a manual training loop where you fetch the weights yourself.

Keep a weights tensor and a fixed (non-variable) batch size:

    import numpy as np
    import keras.backend as K

    weights = K.variable(np.zeros((batch_size,)))

Use it in your custom loss:

    def custom_loss(true, pred):
        # someCalculation is whatever weighted computation you need
        return someCalculation(true, pred, weights)
For a "generator":
for e in range(epochs):
for s in range(steps_per_epoch):
x, y, w = next(generator) #or generator.next(), not sure
K.set_value(weights, w)
model.train_on_batch(x, y)
For a keras.utils.Sequence:

    for e in range(epochs):
        for s in range(len(generator)):
            x, y, w = generator[s]
            K.set_value(weights, w)
            model.train_on_batch(x, y)
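Putting the pieces together with your contrastive loss, a sketch might look like this (multiplying each sample's loss term by its weight is one assumed way to use the weights; substitute your own logic in its place):

    import numpy as np
    import keras.backend as K

    batch_size = 32
    weights = K.variable(np.zeros((batch_size,)))

    def weighted_cosine_contrastive_loss(y_true, y_pred):
        margin = 0.9
        cdist = y_true * y_pred + (1 - y_true) * K.maximum(margin - y_pred, 0.0)
        # scale each sample's loss term by the weight set below
        return K.mean(K.flatten(cdist) * weights)

    model.compile(optimizer="adam", loss=weighted_cosine_contrastive_loss)

    for e in range(epochs):
        for s in range(steps_per_epoch):
            x, y, w = next(generator)
            K.set_value(weights, w.reshape(-1))  # w arrives as (batch_size, 1)
            model.train_on_batch(x, y)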
I know this answer is not optimal, because it does not parallelize fetching data from the generator the way fit_generator does. But it's the simplest solution I can think of: Keras doesn't expose the weights to the loss function; they are applied automatically deep in the source code.
If calculating the weights can be done from x and y, you can delegate this task to the loss function itself. This is sort of hacky, but may work:

    from keras.layers import Input
    from keras.models import Model

    input1 = Input(shape1)
    input2 = Input(shape2)

    # .... model creation .... #

    model = Model([input1, input2], outputs)

Let the loss have access to input1 and input2:

    def custom_loss(y_true, y_pred):
        w = calculate_weights(input1, input2, y_pred)
        # .... rest of the loss .... #

The issue here is whether or not you can calculate the weights as a tensor from the inputs.
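As a concrete illustration of this closure pattern, here is a sketch where the weight formula is invented purely for the example (the mean absolute pixel difference between the two images), computed entirely with tensor ops:

    import keras.backend as K

    def make_loss(input1, input2):
        def custom_loss(y_true, y_pred):
            # invented example weight: how different the two images are
            w = K.mean(K.abs(input1 - input2), axis=[1, 2, 3])
            margin = 0.9
            cdist = y_true * y_pred + (1 - y_true) * K.maximum(margin - y_pred, 0.0)
            return K.mean(K.flatten(cdist) * w)
        return custom_loss

    model.compile(optimizer="adam", loss=make_loss(input1, input2))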
In Keras for TensorFlow 2, the loss function is called with the sample weights:

    output_loss = loss_fn(y_true, y_pred, sample_weight=sample_weight)

https://github.com/keras-team/keras/blob/tf-2/keras/engine/training.py
You can also use GradientTape for custom training; see https://www.tensorflow.org/guide/keras/train_and_evaluate#part_ii_writing_your_own_training_evaluation_loops_from_scratch
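A minimal sketch of such a loop that consumes the (inputs, label, weight) triples from the question's generator and applies the weights inside the loss (the optimizer choice and total_steps are assumptions):

    import tensorflow as tf

    optimizer = tf.keras.optimizers.Adam()

    def weighted_contrastive_loss(y_true, y_pred, w):
        margin = 0.9
        cdist = y_true * y_pred + (1 - y_true) * tf.maximum(margin - y_pred, 0.0)
        return tf.reduce_mean(cdist * w)  # per-sample weights applied directly

    for step in range(total_steps):
        x, y, w = next(generator)  # x is [img1, img2]
        with tf.GradientTape() as tape:
            y_pred = model(x, training=True)
            loss = weighted_contrastive_loss(y, y_pred, w)
        grads = tape.gradient(loss, model.trainable_weights)
        optimizer.apply_gradients(zip(grads, model.trainable_weights))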