fit_generator in keras: where is the batch_size specified?

Tags:

keras

Hi, I don't understand the Keras fit_generator docs.

I hope my confusion is rational.

There is a batch_size parameter, and there is also the concept of training in batches. Using model.fit(), I specify a batch_size of 128.

To me this means that my dataset will be fed in 128 samples at a time, thereby greatly reducing memory use. It should allow a 100 million sample dataset to be trained as long as I have the time to wait. After all, Keras is only "working with" 128 samples at a time. Right?

But I strongly suspect that specifying batch_size alone doesn't do what I want at all. Tons of memory is still being used. For my goals I need to train in batches of 128 examples each.

So I am guessing this is what fit_generator does. What I really want to ask is: why doesn't batch_size actually work as its name suggests?

More importantly, if fit_generator is needed, where do I specify the batch_size? The docs say to loop indefinitely, but a generator loops over every row once. How do I loop over 128 samples at a time, remember where I last stopped, and pick up from there the next time Keras asks for a batch (the second batch would start at row 129)?

asked May 04 '17 10:05

user798719

1 Answer

You need to handle the batch size yourself, inside the generator. Here is an example that generates random batches:

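import numpy as np

# Toy dataset: one feature column plus a label column (0 for even, 1 for odd)
data = np.arange(100)
data_lab = data % 2
wholeData = np.array([data, data_lab]).T

def data_generator(all_data, batch_size=20):
    # Loop forever: Keras keeps asking the generator for more batches
    while True:
        # Pick batch_size random row indices (sampled with replacement)
        idx = np.random.randint(len(all_data), size=batch_size)

        # Assuming the last column contains labels
        batch_x = all_data[idx, :-1]
        batch_y = all_data[idx, -1]

        # Yield a tuple of (Xs, Ys) to feed the model
        yield (batch_x, batch_y)

# The generator never terminates, so pull a single batch rather than iterating over it
print(next(data_generator(wholeData)))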
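If you want sequential batches instead of random ones, the generator can keep track of its own position between calls, because its local variables survive across each yield. The following is only a sketch under a few assumptions: the function name sequential_batches and the 300-row toy array are made up for illustration, the last column is assumed to hold the labels as in the example above, and the last batch of a pass is allowed to be smaller than batch_size.

```python
import math
import numpy as np

def sequential_batches(all_data, batch_size=128):
    """Yield (x, y) batches in row order, forever, wrapping around at the end."""
    n = len(all_data)
    start = 0  # local state: remembers where the previous batch ended
    while True:
        end = min(start + batch_size, n)
        batch = all_data[start:end]
        # Assumes the last column holds the labels
        yield batch[:, :-1], batch[:, -1]
        # Continue from where we stopped, or restart once the data is exhausted
        start = 0 if end >= n else end

# Hypothetical toy data: 300 rows, one feature column and one label column
data = np.arange(300)
whole_data = np.column_stack([data, data % 2])

batch_size = 128
# Number of batches that make up one full pass over the data
steps_per_epoch = math.ceil(len(whole_data) / batch_size)

gen = sequential_batches(whole_data, batch_size)
for _ in range(steps_per_epoch):
    x, y = next(gen)
    print(x.shape, y.shape, "first row value:", x[0, 0])
```

With Keras 2 you would then pass the generator along the lines of model.fit_generator(gen, steps_per_epoch=steps_per_epoch, epochs=...), where steps_per_epoch tells Keras how many batches count as one epoch. Note that this toy array still sits fully in memory; to actually save memory, the generator body would have to read each slice from disk (for example from a file) instead of indexing an in-memory array.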

answered Sep 19 '22 15:09

mehdi
