
How to repeat data with flow_from_directory in Keras

I am trying to use Keras flow_from_directory to train a model, but it does not seem to repeat data after an epoch (i.e. once all the data has been iterated over), and I could not find an option to make it do so. For example, with 70 images in total and batch_size = 32, the first and second iterations give 32 images each, but the third gives only 6. Below is my code for generating data during training.

# data generation from directory without labels
trn = datagen.flow_from_directory(os.path.join(BASE, 'train_gen'),
                                  batch_size=batch_size,
                                  target_size=(inp_shape[:2]),
                                  class_mode=None)
X = trn.next()  # get one batch of data

I want the data generator to start repeating data after it's exhausted.
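
One way to picture the behaviour described above: as far as I know, the iterator returned by flow_from_directory does not stop after one pass; it starts over, and the short batch is just the tail of each pass. The sketch below is plain Python, not Keras code. fake_directory_iterator is a hypothetical stand-in that yields lists of image indices in the pattern described in the question, and full_batches_only is an assumed helper name that skips any batch smaller than batch_size, so a training loop only ever receives full batches.

```python
from itertools import islice


def fake_directory_iterator(n_images, batch_size):
    """Hypothetical stand-in for a directory iterator: loops over the
    data forever, with a shorter final batch at the end of each pass."""
    while True:
        for start in range(0, n_images, batch_size):
            yield list(range(start, min(start + batch_size, n_images)))


def full_batches_only(batches, batch_size):
    """Yield only batches that contain exactly batch_size items."""
    for batch in batches:
        if len(batch) == batch_size:
            yield batch


raw = fake_directory_iterator(n_images=70, batch_size=32)
raw_sizes = [len(b) for b in islice(raw, 6)]

filtered = full_batches_only(fake_directory_iterator(70, 32), 32)
filtered_sizes = [len(b) for b in islice(filtered, 4)]

print(raw_sizes)       # sizes of the first six raw batches
print(filtered_sizes)  # sizes after dropping short batches
```

Since len() also works on a NumPy array, the same wrapper could in principle be placed around a real Keras iterator. The trade-off is that the images in the short batch are skipped on each pass.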

Actually, I am trying to train a GAN: a batch of images is generated by the generator model, concatenated with a batch of real images, and then passed to the discriminator model and the GAN model for training. I can't figure out how to use fit_generator for this. My code is below:

def train(self, inp_shape, batch_size=1, n_epochs=1000):
    BASE = '/content/gdrive/My Drive/Dataset/GAN'

    datagen = ImageDataGenerator(rescale=1./255)
    trn_dist = datagen.flow_from_directory(os.path.join(BASE, 'train_gen'),
                                                      batch_size=batch_size,
                                                      target_size=(inp_shape[:2]),
                                                      seed = 1360000,
                                                      class_mode=None)

    val_dist = datagen.flow_from_directory(os.path.join(BASE, 'test_gen'),
                                                      batch_size=batch_size,
                                                      target_size=(inp_shape[:2]),
                                                      class_mode=None)

    trn_real = datagen.flow_from_directory(os.path.join(BASE, 'train_real'),
                                                      batch_size=batch_size,
                                                      target_size=(inp_shape[:2]),
                                                      seed = 1360000,
                                                      class_mode=None)

    for e in range(n_epochs):

      real_images = trn_real.next()

      dist_images = trn_dist.next()

      gen_images = self.generator.predict(dist_images)

      factor = inp_shape[0]/250
      gen_res = ndi.zoom(gen_images, (1, factor, factor, 1), order=2)      

      X = np.concatenate([real_images, gen_res])

      y = np.zeros(2*batch_size)
      y[:batch_size] = 1.

      self.discriminator.trainable = True
      self.discriminator.fit(X, y, batch_size=batch_size, epochs=n_epochs)

      self.discriminator.trainable = False

      self.model.fit(gen_res, y[:batch_size])
      print ('> training --- epoch=%d/%d' % (e, n_epochs))
      if e > 0 and e % 2000 == 0:
        self.model.save('%s/models/gan_model_%d_.h5'%(BASE, e))

PS: I am new to GANs, so please correct me if I am doing something wrong.
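
A likely source of trouble in the loop above is that y is sized from batch_size, while the last batch of a pass can be smaller (6 images in the 70/32 example). A minimal sketch of sizing the labels from the batches themselves follows. discriminator_labels is a hypothetical helper, and plain lists stand in for the image arrays so the snippet runs without Keras or NumPy.

```python
def discriminator_labels(real_batch, fake_batch):
    """Return one label per sample: 1.0 for real images, 0.0 for generated ones."""
    return [1.0] * len(real_batch) + [0.0] * len(fake_batch)


# Stand-ins for a short final batch: 6 real and 6 generated images.
real_batch = ["real_%d" % i for i in range(6)]
fake_batch = ["fake_%d" % i for i in range(6)]

X = real_batch + fake_batch  # np.concatenate([...]) in the original code
y = discriminator_labels(real_batch, fake_batch)

print(len(X), len(y))
```

In the original loop, the equivalent would be something like y = np.concatenate([np.ones(len(real_images)), np.zeros(len(gen_res))]), so that X and y always have the same length.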

danishansari asked Mar 03 '23 10:03



2 Answers

To shed some light on the problem: first, you need to understand the parameters involved. batch_size (an argument of flow_from_directory) determines how many samples are loaded per step, while the number of epochs (an argument of fit_generator) determines how many times Keras passes over all your data. In essence, with epochs=2 and batch_size=32, Keras goes through all your data twice, split into mini-batches of 32 samples. What is missing from your code is essentially the epochs parameter.

I also recommend setting steps_per_epoch and validation_data. steps_per_epoch determines the number of batches in each epoch, so to visit all your samples in every epoch, set it as follows (rounded up to a whole number so that the final, smaller batch is included):

import math

model.fit_generator(train_generator,
                    steps_per_epoch=math.ceil(train_generator.samples / train_generator.batch_size),
                    epochs=10,
                    validation_data=validation_generator,
                    validation_steps=math.ceil(validation_generator.samples / validation_generator.batch_size))
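
To make the steps_per_epoch arithmetic concrete, here is a small sketch using the numbers from the question (70 images, batch size 32). steps_to_cover is a hypothetical helper name. The point is only that rounding up includes the final short batch, while rounding down skips it.

```python
import math


def steps_to_cover(samples, batch_size):
    """Number of batches needed to see every sample once (last batch may be short)."""
    return math.ceil(samples / batch_size)


samples, batch_size = 70, 32
steps_up = steps_to_cover(samples, batch_size)  # includes the short batch
steps_down = samples // batch_size              # full batches only
leftover = samples - steps_down * batch_size    # images in the short batch

print(steps_up, steps_down, leftover)
```

As far as I know, fit_generator is deprecated in recent TensorFlow releases, where Model.fit accepts a generator directly and takes the same steps_per_epoch argument.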
Mohammad Siavashi answered Mar 06 '23 00:03



The flow_from_directory method is designed to be used with the fit_generator function, which lets you specify the number of epochs.

model.fit_generator(trn, epochs=epochs)

Here, model refers to the model object you want to train. This should solve your problem. Both functions are well explained in the Keras documentation.

JimmyOnThePage answered Mar 05 '23 23:03
