
How to handle the last batch using keras fit_generator

I am using a customised batch generator in an attempt to fix the problem of incompatible shapes (BroadcastGradientArgs error) while using the standard model.fit() function due to the small size of the last batch in the training data. I used the batch generator mentioned here with the model.fit_generator() function:

class Generator(Sequence):
    # Class is a dataset wrapper for better training performance
    def __init__(self, x_set, y_set, batch_size=256):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size
        self.indices = np.arange(self.x.shape[0])

    def __len__(self):
        return math.floor(self.x.shape[0] / self.batch_size) 

    def __getitem__(self, idx):
        inds = self.indices[idx * self.batch_size:(idx + 1) * self.batch_size] #Line A
        batch_x = self.x[inds]
        batch_y = self.y[inds]
        return batch_x, batch_y

    def on_epoch_end(self):
        np.random.shuffle(self.indices)

But it seems that it discards the last batch if its size is smaller than the provided batch size. How can I update it to include the last batch and expand it (for example) with some repeated samples?

Also, somehow I don't get how "Line A" works!

Update: here is how I am using the generator with my model:

# dummy model
input_1 = Input(shape=(None,))
...
dense_1 = Dense(10, activation='relu')(input_1)
output_1 = Dense(1, activation='sigmoid')(dense_1)

model = Model(input_1, output_1)
print(model.summary())

#Compile and fit_generator
model.compile(optimizer='adam', loss='binary_crossentropy')

train_data_gen = Generator(x1_train, y_train, batch_size)
test_data_gen = Generator(x1_test, y_test, batch_size)

model.fit_generator(generator=train_data_gen, validation_data = test_data_gen, epochs=epochs, shuffle=False, verbose=1)

loss, accuracy = model.evaluate_generator(generator=test_data_gen)
print('Test Loss: %0.5f Accuracy: %0.5f' % (loss, accuracy))
Daisy asked Jan 26 '23 09:01



1 Answer

I think the culprit is this line:

    return math.floor(self.x.shape[0] / self.batch_size)

Replacing it with this might work:

    return math.ceil(self.x.shape[0] / self.batch_size) 

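Imagine you have 100 samples and a batch size of 32. That divides into 3.125 batches. If you use math.floor, the batch count becomes 3 and the remaining 0.125 of a batch (the last 4 samples) is discarded. With math.ceil the count becomes 4, and the fourth batch holds just those 4 samples.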
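The arithmetic can be checked directly. This is a minimal sketch using the 100-sample, batch-size-32 figures from the example; the variable names are only illustrative and are not part of the generator class.

```python
import math

n_samples = 100
batch_size = 32

# Number of batches reported by __len__ under each rounding choice.
n_floor = math.floor(n_samples / batch_size)  # only full batches are counted
n_ceil = math.ceil(n_samples / batch_size)    # the partial batch is counted too

# Samples that are never yielded when the batch count is rounded down.
dropped = n_samples - n_floor * batch_size

print(n_floor, n_ceil, dropped)
```

With floor, Keras never asks for the batch at index 3, so the leftover samples are skipped in every epoch (although shuffling changes which ones they are).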

As for Line A: if the batch size is 32, then when idx is 1 the slice [idx * self.batch_size:(idx + 1) * self.batch_size] becomes [32:64]. In other words, it picks the 33rd to 64th elements of self.indices.
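The slicing in Line A can be tried outside the generator. The sketch below assumes 100 samples and a batch size of 32; batch_indices is a hypothetical helper that mirrors Line A and is not part of the original class.

```python
import numpy as np

indices = np.arange(100)  # stands in for self.indices
batch_size = 32

def batch_indices(idx):
    # Same expression as Line A, without the class around it.
    return indices[idx * batch_size:(idx + 1) * batch_size]

second = batch_indices(1)  # slice [32:64]
last = batch_indices(3)    # slice [96:128], which runs past the end of the array

print(second[0], second[-1], len(second))
print(last)
```

NumPy truncates a slice that extends past the end of the array instead of raising an error, so the final batch simply comes back shorter than batch_size.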

Update 2: changed the input to have a None shape, used an LSTM, and added evaluation.

import os
os.environ['CUDA_VISIBLE_DEVICES'] = ""
import math
import numpy as np
from keras.models import Model
from keras.utils import Sequence
from keras.layers import Input, Dense, LSTM


class Generator(Sequence):
    # Class is a dataset wrapper for better training performance
    def __init__(self, x_set, y_set, batch_size=256):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size
        self.indices = np.arange(self.x.shape[0])

    def __len__(self):
        return math.ceil(self.x.shape[0] / self.batch_size)

    def __getitem__(self, idx):
        inds = self.indices[idx * self.batch_size:(idx + 1) * self.batch_size]  # Line A
        batch_x = self.x[inds]
        batch_y = self.y[inds]
        return batch_x, batch_y

    def on_epoch_end(self):
        np.random.shuffle(self.indices)


# dummy model
input_1 = Input(shape=(None, 10))
x = LSTM(90)(input_1)
x = Dense(10)(x)
x = Dense(1, activation='sigmoid')(x)

model = Model(input_1, x)
print(model.summary())

# Compile and fit_generator
model.compile(optimizer='adam', loss='binary_crossentropy')

x1_train = np.random.rand(1590, 20, 10)
x1_test = np.random.rand(90, 20, 10)
y_train = np.random.rand(1590, 1)
y_test = np.random.rand(90, 1)

train_data_gen = Generator(x1_train, y_train, 256)
test_data_gen = Generator(x1_test, y_test, 256)

model.fit_generator(generator=train_data_gen,
                    validation_data=test_data_gen,
                    epochs=5,
                    shuffle=False,
                    verbose=1)

loss = model.evaluate_generator(generator=test_data_gen)
print('Test Loss: %0.5f' % loss)

This runs without any problem.
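The question also asked about expanding the last batch with repeated samples instead of leaving it short. One possible way, sketched here under assumptions, is to top up the index slice before indexing the data. padded_batch_indices is a hypothetical helper, not a Keras function, and filling from the start of the index array is just one choice.

```python
import numpy as np

def padded_batch_indices(indices, idx, batch_size):
    """Return exactly batch_size indices for batch idx, repeating samples if short."""
    inds = indices[idx * batch_size:(idx + 1) * batch_size]
    shortfall = batch_size - len(inds)
    if shortfall > 0:
        # Assumption: reuse samples from the start of the index array as filler.
        # np.resize repeats the array cyclically if shortfall exceeds its length.
        filler = np.resize(indices, shortfall)
        inds = np.concatenate([inds, filler])
    return inds

indices = np.arange(100)
full = padded_batch_indices(indices, 0, 32)    # already full, returned unchanged
padded = padded_batch_indices(indices, 3, 32)  # 4 real samples plus 28 repeats
print(len(full), len(padded))
```

Inside the generator, __getitem__ could call this helper in place of Line A while __len__ keeps using math.ceil. Note that the repeated samples are seen more than once per epoch, so they weigh slightly more in training and are counted more than once in evaluation metrics.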

Natthaphon Hongcharoen answered Jan 28 '23 23:01
