I am training a neural network in Keras (Python, backend: TensorFlow) to perform regression. My output layer therefore has no activation function, and I'm using mean squared error as my loss function.
My question is: I'd like to make sure the sum of all output estimates is (almost) equal to the sum of all the actual labels.
What I mean by that is: I want to make sure that not only (y_real)^i ~ (y_predict)^i for each training example i, but also that sum(y_real) = sum(y_predict), summing over all i. Regular linear regression makes it simple enough to add this restriction, but I don't see anything similar for neural networks. I could just multiply the end result by sum(y_real) / sum(y_predict) (roughly what the snippet after my code below does), but I'm afraid this isn't the ideal way to do it if I don't want to harm individual predictions.
What other options do I have?
(I can't share my data and I can't easily reproduce the problem with different data, but here is the code that was used, as requested:)
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(128, activation='relu', input_dim=459))
model.add(Dense(32, activation='relu'))
model.add(Dense(1))

model.compile(loss='mean_squared_error', optimizer='adam')

model.fit(X_train, Y_train, epochs=5, batch_size=128,
          validation_data=(X_val, Y_val))
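(For illustration, the post-hoc rescaling I mentioned above would look roughly like this; preds, scale and preds_rescaled are just names I'm using here, and I'm assuming Y_train is a NumPy array:)

preds = model.predict(X_train).flatten()
scale = Y_train.sum() / preds.sum()    # factor that forces the sums to match
preds_rescaled = preds * scale         # every individual prediction gets scaled by the same factor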
From an optimization perspective, you want to introduce an equality constraint into the problem. You are looking for the network weights such that the predictions y1_hat, y2_hat and y3_hat minimize the mean squared error w.r.t. the labels y1, y2, y3. In addition, you want the following to hold:

sum(y1, y2, y3) = sum(y1_hat, y2_hat, y3_hat)

Because you use a neural network, you want to impose this constraint in such a way that you can still use backpropagation to train the network.

One way of doing this is by adding a term to the loss function that penalizes differences between sum(y1, y2, y3) and sum(y1_hat, y2_hat, y3_hat).
Minimal working example:
import numpy as np
import keras.backend as K
from keras.layers import Dense, Input
from keras.models import Model

# Some random training data and labels
features = np.random.rand(100, 20)
labels = np.random.rand(100, 3)

# Simple neural net with three outputs
input_layer = Input((20,))
hidden_layer = Dense(16)(input_layer)
output_layer = Dense(3)(hidden_layer)

# Model
model = Model(inputs=input_layer, outputs=output_layer)

# Write a custom loss function
def custom_loss(y_true, y_pred):
    # Normal MSE loss
    mse = K.mean(K.square(y_true - y_pred), axis=-1)
    # Loss that penalizes differences between sum(predictions) and sum(labels)
    sum_constraint = K.square(K.sum(y_pred, axis=-1) - K.sum(y_true, axis=-1))
    return mse + sum_constraint

# Compile with custom loss
model.compile(loss=custom_loss, optimizer='sgd')
model.fit(features, labels, epochs=1, verbose=1)
Note that this imposes the constraint in a 'soft' way and not as a hard constraint. You will still get deviations, but the network should learn weights in such a way that these will be small.
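If the remaining deviation is too large for your use case, you can also weight the penalty term. Here is a sketch of that variation (lambda_sum is just a name I picked for the weight, and a good value will depend on your data):

# Same idea as above, but with a tunable weight on the sum penalty
def custom_loss_weighted(y_true, y_pred):
    lambda_sum = 10.0  # hypothetical weight; tune it for your problem
    mse = K.mean(K.square(y_true - y_pred), axis=-1)
    sum_constraint = K.square(K.sum(y_pred, axis=-1) - K.sum(y_true, axis=-1))
    return mse + lambda_sum * sum_constraint

model.compile(loss=custom_loss_weighted, optimizer='sgd')

Larger values of lambda_sum push the summed predictions closer to the summed labels, at the cost of larger errors on individual predictions.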