How to restrict the sum of predicted outputs in a neural network regression using Keras (tensorflow)

I am training a neural network in Keras (Python, backend: TensorFlow) for a regression task. My output layer therefore does not contain an activation function, and I'm using mean squared error as my loss function.

My question is: I'd like to make sure the sum of all output estimates is (almost) equal to the sum of all the actual labels.

What I mean by that is: I want to make sure that not only (y_real)^i ~ (y_predict)^i for each training example i, but also that sum(y_real) = sum(y_predict), summing over all i. Regular linear regression makes it simple enough to add this restriction, but I don't see anything similar for neural networks. I could just multiply the end result by sum(y_real) / sum(y_predict), but I'm afraid this isn't the ideal way to do it if I don't want to harm individual predictions.
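For reference, the scaling correction I have in mind would look roughly like this after training (a sketch only; variable names match the code below):

import numpy as np

# Sketch of the post-hoc correction: rescale all predictions so that
# their sum matches the sum of the true labels
y_predict = model.predict(X_train).ravel()
scale = np.sum(Y_train) / np.sum(y_predict)
y_predict_scaled = y_predict * scale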

What other options do I have?

(I can't share my data and I can't easily reproduce the problem with different data, but here is the code that was used, as requested:)

from keras.models import Sequential
from keras.layers import Dense

# Regression network: no activation on the output layer
model = Sequential()
model.add(Dense(128, activation='relu', input_dim=459))
model.add(Dense(32, activation='relu'))
model.add(Dense(1))

model.compile(loss='mean_squared_error',
              optimizer='adam')

model.fit(X_train, Y_train, epochs=5,
          validation_data=(X_val, Y_val), batch_size=128)
1 Answer

From an optimization perspective, you want to introduce an equality constraint into the problem. You are looking for the network weights such that the predictions y1_hat, y2_hat and y3_hat minimize the mean squared error w.r.t. the labels y1, y2, y3. In addition, you want the following to hold:

sum(y1, y2, y3) = sum(y1_hat, y2_hat, y3_hat)

Because you use a neural network, you want to impose this constraint in such a way that you can still use backpropagation to train the network.

One way of doing this is by adding a term to the loss function that penalizes differences between sum(y1, y2, y3) and sum(y1_hat, y2_hat, y3_hat).

Minimal working example:

import numpy as np
import keras.backend as K
from keras.layers import Dense, Input
from keras.models import Model

# Some random training data and labels
features = np.random.rand(100, 20)
labels = np.random.rand(100, 3)

# Simple neural net with three outputs
input_layer = Input((20,))
hidden_layer = Dense(16)(input_layer)
output_layer = Dense(3)(hidden_layer)

# Model
model = Model(inputs=input_layer, outputs=output_layer)

# Write a custom loss function
def custom_loss(y_true, y_pred):
    # Normal MSE loss
    mse = K.mean(K.square(y_true - y_pred), axis=-1)
    # Loss that penalizes differences between sum(predictions) and sum(labels)
    sum_constraint = K.square(K.sum(y_pred, axis=-1) - K.sum(y_true, axis=-1))

    return mse + sum_constraint

# Compile with custom loss
model.compile(loss=custom_loss, optimizer='sgd')
model.fit(features, labels, epochs=1, verbose=1)

Note that this imposes the constraint in a 'soft' way and not as a hard constraint. You will still get deviations, but the network should learn weights in such a way that these will be small.
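In the example above the sums run over the three outputs of each sample; for a single-output network like the one in the question, you would sum over the batch axis instead, which only enforces the constraint per mini-batch rather than over the whole training set. A rough sketch of that variant, where the penalty weight is an arbitrary hyperparameter you would have to tune:

import keras.backend as K

# Sketch: single-output variant that penalizes the difference between
# the batch-wise sums of predictions and labels.
# The weight 0.1 is an arbitrary hyperparameter.
def batch_sum_loss(y_true, y_pred):
    mse = K.mean(K.square(y_true - y_pred), axis=-1)
    sum_penalty = K.square(K.sum(y_pred) - K.sum(y_true))
    return mse + 0.1 * sum_penalty

You would then compile the single-output model from the question with this loss instead of 'mean_squared_error'.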
