I have a huge dataset that I need to provide to Keras in the form of a generator because it does not fit into memory. However, using fit_generator, I cannot replicate the results I get during usual training with model.fit. Also, each epoch lasts considerably longer.
I implemented a minimal example. Maybe someone can show me where the problem is.
import random
import numpy
from keras.layers import Dense
from keras.models import Sequential
random.seed(23465298)
numpy.random.seed(23465298)
no_features = 5
no_examples = 1000
def get_model():
    network = Sequential()
    network.add(Dense(8, input_dim=no_features, activation='relu'))
    network.add(Dense(1, activation='sigmoid'))
    network.compile(loss='binary_crossentropy', optimizer='adam')
    return network

def get_data():
    example_input = [[float(f_i == e_i % no_features) for f_i in range(no_features)] for e_i in range(no_examples)]
    example_target = [[float(t_i % 2)] for t_i in range(no_examples)]
    return example_input, example_target

def data_gen(all_inputs, all_targets, batch_size=10):
    input_batch = numpy.zeros((batch_size, no_features))
    target_batch = numpy.zeros((batch_size, 1))
    while True:
        for example_index, each_example in enumerate(zip(all_inputs, all_targets)):
            each_input, each_target = each_example
            wrapped = example_index % batch_size
            input_batch[wrapped] = each_input
            target_batch[wrapped] = each_target
            if wrapped == batch_size - 1:
                yield input_batch, target_batch

if __name__ == "__main__":
    input_data, target_data = get_data()
    g = data_gen(input_data, target_data, batch_size=10)
    model = get_model()
    model.fit(input_data, target_data, epochs=15, batch_size=10)  # 15 * (1000 / 10) * 10
    # model.fit_generator(g, no_examples // 10, epochs=15)        # 15 * (1000 / 10) * 10
On my computer, model.fit always finishes the 10th epoch with a loss of 0.6939, after roughly 2-3 seconds. The method model.fit_generator, however, runs considerably longer and finishes the last epoch with a different loss (0.6931).
In general, I don't understand why the results of the two approaches differ. The difference might not look like much, but I need to be sure that the same data with the same net produce the same result, regardless of whether I use conventional training or the generator.
Update: @Alex R. provided an answer for part of the original problem (some of the performance issue as well as changing results with each run). As the core problem remains, however, I merely adjusted the question and title accordingly.
With the fit method you pass your whole dataset at once; use it when you can load all of your data into memory (a small dataset). With fit_generator(), you don't pass x and y directly; instead, the batches come from a generator.
fit is used when the entire training dataset fits into memory and no data augmentation is applied. fit_generator is used when either the dataset is too huge to fit into memory or data augmentation needs to be applied.
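To make the contrast concrete, here is a minimal sketch of the two calling conventions. The names model, x, y, and batch_gen are placeholders (a compiled Keras model, in-memory numpy arrays, and a generator that yields (x_batch, y_batch) tuples indefinitely), not code from the question:

# fit: the whole dataset sits in memory and Keras slices out batches itself
model.fit(x, y, batch_size=10, epochs=15)

# fit_generator: batches come from the generator; steps_per_epoch tells Keras
# how many generator calls make up one epoch
model.fit_generator(batch_gen, steps_per_epoch=len(x) // 10, epochs=15)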
After you have created and configured your ImageDataGenerator, you must fit it on your data. This will calculate any statistics required to actually perform the transforms on your image data. You do this by calling the fit() function on the data generator and passing it your training dataset.
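For example (a sketch; x_train, y_train, and model are placeholder names for image arrays and a compiled model, not from the question):

from keras.preprocessing.image import ImageDataGenerator

# featurewise statistics must be computed before batches can be normalized
datagen = ImageDataGenerator(featurewise_center=True,
                             featurewise_std_normalization=True)
datagen.fit(x_train)  # computes the dataset-wide mean and std

# flow() then yields already-normalized batches for fit_generator to consume
model.fit_generator(datagen.flow(x_train, y_train, batch_size=32),
                    steps_per_epoch=len(x_train) // 32,
                    epochs=15)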
Batch sizes
In fit, you're using the standard batch size of 32. In fit_generator, you're using a batch size of 10. Keras probably runs a weight update after each batch, so if you're using batches of different sizes, you get different gradients between the two methods. And once there is a different weight update, both models will never meet again.
Try to use fit with batch_size=10, or use a generator with batch_size=32, as in the sketch below.
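A sketch using the question's own names (model, input_data, target_data, g), making both paths perform the same number of same-sized weight updates per epoch:

batch_size = 10
steps_per_epoch = no_examples // batch_size  # 100 weight updates per epoch

# either train from memory with that batch size...
model.fit(input_data, target_data, epochs=15, batch_size=batch_size)
# ...or drive the generator for the same number of steps per epoch
model.fit_generator(g, steps_per_epoch=steps_per_epoch, epochs=15)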
Seed problem?
Are you creating a new model with get_model() for each case?
If so, the weights of the two models are different, and naturally you will get different results for the two models. (OK, you've set a seed, but if you're using TensorFlow, maybe you're facing this issue.)
In the long run they will sort of converge, though. The difference between the two doesn't seem that big.
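One way to take the initialization out of the equation (a sketch, assuming the TF1-era Keras backend this question uses) is to seed TensorFlow in addition to random and numpy, or to reuse the exact same initial weights for both runs:

import tensorflow as tf

tf.set_random_seed(23465298)  # TF1 API; in TF2 this is tf.random.set_seed

# or: snapshot the initial weights and restore them before the second run
model = get_model()
initial_weights = model.get_weights()
model.fit(input_data, target_data, epochs=15, batch_size=10)

model.set_weights(initial_weights)  # both runs now start from identical weights
model.fit_generator(g, steps_per_epoch=no_examples // 10, epochs=15)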
Checking data
If you are not sure that your generator yields the data you expect, do a simple loop on it and print/compare/check what it yields:

for i in range(numberOfBatches):
    x, y = next(g)  # or g.next() in Python 2
    # print or compare x and y here
Make sure to shuffle your batches within your generator, for example as sketched below.
This discussion suggests turning on shuffle in your iterator: https://github.com/keras-team/keras/issues/2389. I had the same problem, and this resolved it.
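A sketch of one way to do that (my own variant of the question's data_gen, not code from the thread), reshuffling the example order at the start of every epoch:

def shuffling_data_gen(all_inputs, all_targets, batch_size=10):
    all_inputs = numpy.asarray(all_inputs)
    all_targets = numpy.asarray(all_targets)
    n = len(all_inputs)
    while True:
        order = numpy.random.permutation(n)  # new order every epoch
        for start in range(0, n - batch_size + 1, batch_size):
            idx = order[start:start + batch_size]
            # yield copies so Keras never sees a buffer that is mutated later
            yield all_inputs[idx].copy(), all_targets[idx].copy()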