I am training a model with different outputs in PyTorch, and I have four different losses for positions (in meter), rotations (in degree), and velocity, and a boolean value of 0 or 1 that the model has to predict.
AFAIK, there are two ways to define a final loss function here:
one - the naive weighted sum of the losses
two - the defining coefficient for each loss to optimize the final loss.
So, My question is how is better to weigh these losses to obtain the final loss, correctly?
This is not a question about programming but instead about optimization in a multi-objective setup. The two options you've described come down to the same approach which is a linear combination of the loss term. However, keep in mind there are many other approaches out there with dynamic loss weighting, uncertainty weighting, etc... In practice, the most often used approach is the linear combination where each objective gets a weight that is determined via grid-search or random-search.
You can look up this survey on multi-task learning which showcases some approaches: Multi-Task Learning for Dense Prediction Tasks: A Survey, Vandenhende et al., T-PAMI'20.
This is an active line of research, as such, there is no definite answer to your question.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With