
Get loss values for each training instance - Keras

I want to get the loss value for each training instance as the model trains.

history = model.fit(...)

For example, the code above returns loss values per epoch, not per mini-batch or per instance.

What is the best way to do this? Any suggestions?

asked Jan 05 '18 by e.hunnigton

2 Answers

There is exactly what you are looking for at the end of the official Keras callbacks documentation page: https://keras.io/callbacks/#callback

Here is the code to create a custom callback:

import keras
from keras.models import Sequential
from keras.layers import Dense, Activation

class LossHistory(keras.callbacks.Callback):
    def on_train_begin(self, logs=None):
        # one entry per batch, accumulated over the whole training run
        self.losses = []

    def on_batch_end(self, batch, logs=None):
        self.losses.append(logs.get('loss'))

model = Sequential()
model.add(Dense(10, input_dim=784, kernel_initializer='uniform'))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

history = LossHistory()
model.fit(x_train, y_train, batch_size=128, epochs=20, verbose=0, callbacks=[history])

print(history.losses)
# outputs
'''
[0.66047596406559383, 0.3547245744908703, ..., 0.25953155204159617, 0.25901699725311789]
'''
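Since history.losses is a flat Python list with one entry per batch, you can post-process it however you like. As a minimal sketch (plain Python, no Keras required), here is one hypothetical way to average the per-batch losses back into per-epoch values, assuming every epoch contains the same number of batches:

```python
def epoch_means(batch_losses, batches_per_epoch):
    """Average consecutive per-batch losses into one value per epoch.

    Assumes len(batch_losses) is a multiple of batches_per_epoch.
    """
    return [
        sum(batch_losses[i:i + batches_per_epoch]) / batches_per_epoch
        for i in range(0, len(batch_losses), batches_per_epoch)
    ]

# e.g. 2 epochs of 4 batches each
per_batch = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
per_epoch = epoch_means(per_batch, 4)  # roughly [0.75, 0.35]
```

This is just list slicing over the callback's output; the `epoch_means` helper and the sample numbers are illustrative, not part of the answer above.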
answered Oct 21 '22 by flacout


If you want to get loss values for each batch, you can call model.train_on_batch inside a loop. It's hard to provide a complete example without knowing your dataset, but you will have to break your dataset into batches and feed them one by one:

def make_batches(...):
    ...

batches = make_batches(...)
batch_losses = [model.train_on_batch(x, y) for x, y in batches]
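The make_batches helper is left to you; purely as an illustration, here is one hypothetical way to write it with slice-based indexing (the signature, names, and shapes here are assumptions, not part of the answer):

```python
import numpy as np

def make_batches(batchsize, x, y):
    """Yield (x_batch, y_batch) pairs of at most `batchsize` rows each."""
    for start in range(0, len(x), batchsize):
        yield x[start:start + batchsize], y[start:start + batchsize]

# 10 samples split into batches of 4 -> batch sizes 4, 4, 2
x = np.arange(20).reshape(10, 2)
y = np.arange(10)
sizes = [len(bx) for bx, by in make_batches(4, x, y)]
```

Any iterable of (x, y) pairs works here; each pair can then be fed to model.train_on_batch as in the list comprehension above.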

It's a bit more complicated with single instances. You can, of course, train on batches of size 1, though doing so will most likely thrash your optimiser (by maximising gradient variance) and significantly degrade performance.

Besides, since loss functions are evaluated outside of Python's domain, there is no direct way to hijack the computation without tinkering with the C/C++ and CUDA sources. Even then, the backend evaluates the loss batch-wise (benefitting from highly vectorised matrix operations), so forcing it to score each instance separately would severely degrade performance. Simply put, hacking the backend would only (probably) help you reduce GPU memory transfers, compared to training on size-1 batches from the Python interface.

If you really want per-instance scores, I would recommend training on batches and evaluating on instances; this way you avoid the high-variance issue and skip expensive gradient computations, since gradients are only estimated during training:

def make_batches(batchsize, x, y):
    ...


batchsize = n
# materialise the batches: they are iterated twice below
batches = list(make_batches(batchsize, x, y))
batch_instances = [list(make_batches(1, bx, by)) for bx, by in batches]
losses = [
    (model.train_on_batch(bx, by),
     [model.test_on_batch(*inst) for inst in instances])
    for (bx, by), instances in zip(batches, batch_instances)
]
answered Oct 21 '22 by Eli Korvigo