I am doing multivariate regression with a fully connected multilayer neural network in TensorFlow. The network predicts 2 continuous float variables (y1, y2) given an input vector (x1, x2, ..., xN), i.e. the network has 2 output nodes. With 2 outputs the network does not seem to converge. My loss function is essentially the L2 distance between the prediction and truth vectors (each contains 2 scalars):
loss = tf.nn.l2_loss(tf.subtract(prediction, truthValues_placeholder)) + L2regularizationLoss
I am using L2 regularization, dropout regularization, and my activation functions are tanh.
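For concreteness, here is a minimal sketch of this kind of setup in TF 1.x style with placeholders; the layer sizes, initializer scale, and regularization weight are illustrative, not my actual values:

    import tensorflow as tf

    # Illustrative dimensions, not the actual ones.
    N_INPUTS, N_HIDDEN, N_OUTPUTS = 10, 64, 2

    inputs_placeholder = tf.placeholder(tf.float32, [None, N_INPUTS])
    truthValues_placeholder = tf.placeholder(tf.float32, [None, N_OUTPUTS])
    keep_prob = tf.placeholder(tf.float32)  # dropout keep probability

    # One tanh hidden layer with dropout; the output layer is linear,
    # as is usual for regression targets.
    W1 = tf.Variable(tf.truncated_normal([N_INPUTS, N_HIDDEN], stddev=0.1))
    b1 = tf.Variable(tf.zeros([N_HIDDEN]))
    hidden = tf.nn.dropout(tf.tanh(tf.matmul(inputs_placeholder, W1) + b1), keep_prob)

    W2 = tf.Variable(tf.truncated_normal([N_HIDDEN, N_OUTPUTS], stddev=0.1))
    b2 = tf.Variable(tf.zeros([N_OUTPUTS]))
    prediction = tf.matmul(hidden, W2) + b2

    # L2 weight penalty added to the data term, as in the loss above.
    L2regularizationLoss = 1e-4 * (tf.nn.l2_loss(W1) + tf.nn.l2_loss(W2))
    loss = tf.nn.l2_loss(tf.subtract(prediction, truthValues_placeholder)) + L2regularizationLoss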
My questions: Is L2 distance the proper way to calculate loss for a multivariate network output? Are there some tricks needed to get multivariate regression networks to converge (as opposed to single-variable networks and classifiers)?
Yes, you can use L2 distance for multivariate regression, but I would recommend experimenting with the absolute (L1) distance as well.
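For example, an L1 version of your loss, keeping your variable names, would look something like:

    # L1 (absolute) distance between prediction and truth, summed over the batch.
    l1_loss = tf.reduce_sum(tf.abs(tf.subtract(prediction, truthValues_placeholder))) + L2regularizationLoss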
One problem with L2 is its susceptibility to outliers; one problem with L1 is its non-smoothness at the origin.
You can address both issues with the Huber loss, which behaves like L2 near the origin and like L1 as you move away from it.
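Here is a sketch of a manual Huber loss in the same style; delta (the crossover point between the quadratic and linear regimes) is a tunable hyperparameter, and newer TF versions also ship a built-in tf.losses.huber_loss:

    def huber_loss(predictions, labels, delta=1.0):
        # Quadratic (L2-like) where |residual| < delta, linear (L1-like) beyond it.
        residual = tf.abs(tf.subtract(predictions, labels))
        quadratic = 0.5 * tf.square(residual)
        linear = delta * (residual - 0.5 * delta)
        return tf.reduce_sum(tf.where(residual < delta, quadratic, linear))

    loss = huber_loss(prediction, truthValues_placeholder) + L2regularizationLoss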