
Keras' `model.fit_generator()` behaves differently than `model.fit()`

I have a huge dataset that I need to provide to Keras in the form of a generator because it does not fit into memory. However, using fit_generator, I cannot replicate the results I get during usual training with model.fit. Each epoch also lasts considerably longer.

I implemented a minimal example. Maybe someone can show me where the problem is.

import random
import numpy

from keras.layers import Dense
from keras.models import Sequential

random.seed(23465298)
numpy.random.seed(23465298)

no_features = 5
no_examples = 1000


def get_model():
    network = Sequential()
    network.add(Dense(8, input_dim=no_features, activation='relu'))
    network.add(Dense(1, activation='sigmoid'))
    network.compile(loss='binary_crossentropy', optimizer='adam')
    return network


def get_data():
    example_input = [[float(f_i == e_i % no_features) for f_i in range(no_features)] for e_i in range(no_examples)]
    example_target = [[float(t_i % 2)] for t_i in range(no_examples)]
    return example_input, example_target


def data_gen(all_inputs, all_targets, batch_size=10):
    input_batch = numpy.zeros((batch_size, no_features))
    target_batch = numpy.zeros((batch_size, 1))
    while True:
        for example_index, each_example in enumerate(zip(all_inputs, all_targets)):
            each_input, each_target = each_example
            wrapped = example_index % batch_size
            input_batch[wrapped] = each_input
            target_batch[wrapped] = each_target
            if wrapped == batch_size - 1:
                yield input_batch, target_batch


if __name__ == "__main__":
    input_data, target_data = get_data()
    g = data_gen(input_data, target_data, batch_size=10)
    model = get_model()
    model.fit(input_data, target_data, epochs=15, batch_size=10)  # 15 * (1000 / 10) * 10
    # model.fit_generator(g, no_examples // 10, epochs=15)        # 15 * (1000 / 10) * 10

On my computer, model.fit always finishes the 10th epoch with a loss of 0.6939, after about 2-3 seconds.

The method model.fit_generator, however, runs considerably longer and finishes the last epoch with a different loss (0.6931).

I don't understand why the results of the two approaches differ at all. The difference might not seem like much, but I need to be sure that the same data with the same network produce the same result, independent of whether I use conventional training or the generator.
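One thing worth checking first: seeding `random` and `numpy` makes all numpy-side initialization reproducible, but it does not seed the TensorFlow backend, which can still vary between runs. A small illustration of the distinction (numpy only, so it runs anywhere; `tf.set_random_seed` is the backend-side equivalent in TensorFlow 1.x):

```python
import numpy as np

def init_weights(seed):
    # Stand-in for a layer initializer driven by numpy's RNG.
    rng = np.random.RandomState(seed)
    return rng.randn(5, 8)

# With the same seed, numpy-side initialization is bit-identical...
assert np.array_equal(init_weights(23465298), init_weights(23465298))
# ...but anything drawn from an unseeded backend RNG (e.g. TensorFlow
# without tf.set_random_seed) can still differ between runs.
print("numpy seeding is reproducible")
```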

Update: @Alex R. provided an answer for part of the original problem (some of the performance issues as well as the changing results between runs). Since the core problem remains, however, I merely adjusted the question and title accordingly.

asked Aug 29 '17 by wehnsdaefflae



2 Answers

Batch sizes

  • In fit, you're using the standard batch size = 32.
  • In fit_generator, you're using a batch size = 10.

Keras runs a weight update after each batch, so if you're using batches of different sizes, the two methods compute different gradients. And once a single weight update differs, the two models will never agree again.

Try to use fit with batch_size=10, or use a generator with batch_size=32.
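A quick back-of-the-envelope check (plain Python, no Keras needed) of why this matters: the number of weight updates per epoch equals the number of batches, assuming fit uses its default batch_size of 32:

```python
import math

no_examples = 1000

# Updates per epoch = number of batches per epoch.
updates_fit = math.ceil(no_examples / 32)  # fit with the default batch_size=32
updates_gen = no_examples // 10            # fit_generator with batch_size=10

print(updates_fit, updates_gen)  # 32 vs. 100 weight updates per epoch
```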


Seed problem?

Are you creating a new model with get_model() for each case?

If so, the weights of the two models are different, and naturally you will get different results from them. (OK, you've set a seed, but if you're using the TensorFlow backend, you may be facing this issue.)

In the long run they will sort of converge, though. The difference between the two doesn't seem that large.
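One way to rule out initialization as the cause, without relying on seeding at all, is to capture one set of initial weights and reuse it for both runs (in Keras this is done with `model.get_weights()` / `model.set_weights()`). A numpy-only sketch of the idea:

```python
import numpy as np

rng = np.random.RandomState(23465298)

# Stand-ins for a model's weight arrays; with Keras you would call
# get_weights() once and set_weights(initial) on each freshly built model.
initial = [rng.randn(5, 8), rng.randn(8, 1)]

model_for_fit = [w.copy() for w in initial]
model_for_generator = [w.copy() for w in initial]

# Both runs now start from bit-identical weights, so any later divergence
# must come from the training procedure itself, not from initialization.
assert all(np.array_equal(a, b)
           for a, b in zip(model_for_fit, model_for_generator))
print("identical starting weights")
```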


Checking data

If you are not sure that your generator yields the same data as you expect, do a simple loop on it and print/compare/check the data it yields:

for i in range(number_of_batches):
    x, y = next(g)  # or g.next() in Python 2
    # print or compare x, y here
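A self-contained version of that check, using a generator written like the one in the question (smaller data here for brevity), verifying each yielded batch against the corresponding slice of the source arrays:

```python
import numpy as np

no_features, batch_size = 5, 10
inputs = [[float(f == e % no_features) for f in range(no_features)]
          for e in range(100)]
targets = [[float(t % 2)] for t in range(100)]

def data_gen(all_inputs, all_targets, batch_size=10):
    input_batch = np.zeros((batch_size, no_features))
    target_batch = np.zeros((batch_size, 1))
    while True:
        for i, (x, y) in enumerate(zip(all_inputs, all_targets)):
            j = i % batch_size
            input_batch[j] = x
            target_batch[j] = y
            if j == batch_size - 1:
                yield input_batch, target_batch

g = data_gen(inputs, targets)
for b in range(len(inputs) // batch_size):
    x, y = next(g)
    # Each batch must equal the corresponding slice of the source data.
    assert np.array_equal(x, np.asarray(inputs[b * batch_size:(b + 1) * batch_size]))
    assert np.array_equal(y, np.asarray(targets[b * batch_size:(b + 1) * batch_size]))
print("generator matches source data")
```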

answered Sep 21 '22 by Daniel Möller


Make sure to shuffle your batches within your generator.

This discussion suggests you turn on shuffle in your iterator: https://github.com/keras-team/keras/issues/2389. I had the same problem and this resolved it.
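A minimal sketch of such a shuffling generator (the names are illustrative, not from the question): it draws a fresh permutation on every pass over the data, so each epoch sees the examples in a new order:

```python
import numpy as np

def shuffling_gen(inputs, targets, batch_size=10, seed=0):
    inputs, targets = np.asarray(inputs), np.asarray(targets)
    rng = np.random.RandomState(seed)
    n = len(inputs)
    while True:
        order = rng.permutation(n)  # new shuffle for every epoch
        for start in range(0, n - batch_size + 1, batch_size):
            idx = order[start:start + batch_size]
            yield inputs[idx], targets[idx]

g = shuffling_gen(np.arange(20).reshape(10, 2), np.arange(10), batch_size=5)
x, y = next(g)
print(x.shape, y.shape)  # (5, 2) and (5,)
```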

answered Sep 18 '22 by Cerno