Training model with fit_generator does not show val_loss and val_acc and interrupted at first epoch

Question

I implemented a data generator to split my training data into mini batches of 256 to avoid memory errors. Its running on the training data but it does not show validation loss and validation accuracy at the end of each epochs. I also applied the data generator on validation data and defined validation steps. I don't know exactly what's the wrong with the code that its's not showing validation loss and accuracy? here is the code:

early_stopping_cb=tf.keras.callbacks.EarlyStopping(patience=3,restore_best_weights=True)
batch_size=256
epoch_steps=math.ceil(len(utt)/ batch_size)
val_steps=math.ceil(len(val_prev)/ batch_size)

hist = model.fit_generator(generate_data(utt_minus_one, utt, y_train, batch_size),
                steps_per_epoch=epoch_steps, epochs=3,
                callbacks = [early_stopping_cb],
                validation_data=generate_data(val_prev, val_curr,y_val,batch_size),
                validation_steps=val_steps,  class_weight=custom_weight_dict,
                 verbose=1)

here is code for generator:

#method to use generator to split data into mini batches of 256 each loaded at run time
def generate_data(X1,X2,Y,batch_size):
  p_input=[]
  c_input=[]
  target=[]
  batch_count=0
  for i in range(len(X1)):
    p_input.append(X1[i])
    c_input.append(X2[i])
    target.append(Y[i])
    batch_count+=1
    if batch_count>batch_size:
      prev_X=np.array(p_input,dtype=np.int64)
      cur_X=np.array(c_input,dtype=np.int64)
      cur_y=np.array(target,dtype=np.int32)
      yield ([prev_X,cur_X],cur_y ) 
      p_input=[]
      c_input=[]
      target=[]
      batch_count=0
  return

Here is the trace for first epoch which also gives an error:

Epoch 1/3
346/348 [============================>.] - ETA: 4s - batch: 172.5000 - size: 257.0000 - loss: 0.8972 - accuracy: 0.8424WARNING:tensorflow:Your dataset iterator ran out of data; interrupting training. Make sure that your iterator can generate at least `steps_per_epoch * epochs` batches (in this case, 1044 batches). You may need touse the repeat() function when building your dataset.
WARNING:tensorflow:Early stopping conditioned on metric `val_loss` which is not available. Available metrics are: loss,accuracy
346/348 [============================>.] - 858s 2s/step - batch: 172.5000 - size: 257.0000 - loss: 0.8972 - accuracy: 0.8424

Can any one help in sorting out these issues?

Aizayousaf · Accepted Answer

There is need of a one while loop for per epoch over the for loop to split into mini batches. So if there are 348 batches per epochs then 3*348= 1044 batches overall.


#method to use generator to split data into mini batches of 256 each loaded at run time
def generate_data(X1,X2,Y,batch_size):
  count=0
  p_input=[]
  c_input=[]
  target=[]
  batch_count=0
  while True:
    for i in range(len(X1)):
      p_input.append(X1[i])
      c_input.append(X2[i])
      target.append(Y[i])
      batch_count+=1
      if batch_count>batch_size:
        count=count+1
        prev_X=np.array(p_input,dtype=np.int64)
        cur_X=np.array(c_input,dtype=np.int64)
        cur_y=np.array(target,dtype=np.int32)
        yield ([prev_X,cur_X],cur_y ) 
        p_input=[]
        c_input=[]
        target=[]
        batch_count=0
    print(count)
  return

And trace for first epoch:

Epoch 1/3
335/347 [===========================>..] - ETA: 30s - batch: 167.0000 - size: 257.0000 - loss: 1.2734 - accuracy: 0.8105346
347/347 [==============================] - ETA: 0s - batch: 173.0000 - size: 257.0000 - loss: 1.2635 - accuracy: 0.8113WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training_v1.py:2048: Model.state_updates (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version.
Instructions for updating:
This property should not be used in TensorFlow 2.0, as updates are applied automatically.
86
347/347 [==============================] - 964s 3s/step - batch: 173.0000 - size: 257.0000 - loss: 1.2635 - accuracy: 0.8113 - val_loss: 0.5700 - val_accuracy: 0.8367

Training model with fit_generator does not show val_loss and val_acc and interrupted at first epoch

Tags:

python

machine-learning

tensorflow

deep-learning

keras

Aizayousaf

1 Answers

Aizayousaf

Recent Activity

Donate For Us

Training model with fit_generator does not show val_loss and val_acc and interrupted at first epoch

Tags:

python

machine-learning

tensorflow

deep-learning

keras

Aizayousaf

1 Answers

Aizayousaf

Related questions

Recent Activity

Donate For Us