
Training a model with fit_generator does not show val_loss and val_acc and is interrupted at the first epoch

I implemented a data generator that splits my training data into mini-batches of 256 to avoid memory errors. It runs on the training data, but the validation loss and validation accuracy are not shown at the end of each epoch. I also use the generator for the validation data and set validation_steps. I can't tell what is wrong with the code that keeps the validation loss and accuracy from being shown. Here is the code:

early_stopping_cb=tf.keras.callbacks.EarlyStopping(patience=3,restore_best_weights=True)
batch_size=256
epoch_steps=math.ceil(len(utt)/ batch_size)
val_steps=math.ceil(len(val_prev)/ batch_size)

hist = model.fit_generator(generate_data(utt_minus_one, utt, y_train, batch_size),
                steps_per_epoch=epoch_steps, epochs=3,
                callbacks = [early_stopping_cb],
                validation_data=generate_data(val_prev, val_curr,y_val,batch_size),
                validation_steps=val_steps,  class_weight=custom_weight_dict,
                 verbose=1)

Here is the code for the generator:

#method to use generator to split data into mini batches of 256 each loaded at run time
def generate_data(X1,X2,Y,batch_size):
  p_input=[]
  c_input=[]
  target=[]
  batch_count=0
  for i in range(len(X1)):
    p_input.append(X1[i])
    c_input.append(X2[i])
    target.append(Y[i])
    batch_count+=1
    if batch_count>batch_size:
      prev_X=np.array(p_input,dtype=np.int64)
      cur_X=np.array(c_input,dtype=np.int64)
      cur_y=np.array(target,dtype=np.int32)
      yield ([prev_X,cur_X],cur_y ) 
      p_input=[]
      c_input=[]
      target=[]
      batch_count=0
  return

Here is the trace for the first epoch, which also ends with warnings:

Epoch 1/3
346/348 [============================>.] - ETA: 4s - batch: 172.5000 - size: 257.0000 - loss: 0.8972 - accuracy: 0.8424WARNING:tensorflow:Your dataset iterator ran out of data; interrupting training. Make sure that your iterator can generate at least `steps_per_epoch * epochs` batches (in this case, 1044 batches). You may need touse the repeat() function when building your dataset.
WARNING:tensorflow:Early stopping conditioned on metric `val_loss` which is not available. Available metrics are: loss,accuracy
346/348 [============================>.] - 858s 2s/step - batch: 172.5000 - size: 257.0000 - loss: 0.8972 - accuracy: 0.8424

Can anyone help me sort out these issues?

Asked by Aizayousaf on Feb 01 '26 at 19:02


1 Answer

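The generator needs an outer while loop around the for loop that builds the mini-batches, so that it starts over after each pass through the data instead of running out. Keras pulls steps_per_epoch * epochs batches from the same generator, so with 348 batches per epoch and 3 epochs it needs 3*348 = 1044 batches overall, which is the number in the warning.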


import numpy as np

# Generator that splits the data into mini-batches loaded at run time.
def generate_data(X1,X2,Y,batch_size):
  count=0          # batches yielded so far (debugging only)
  p_input=[]
  c_input=[]
  target=[]
  batch_count=0
  # Loop forever: Keras pulls steps_per_epoch * epochs batches from one generator.
  while True:
    for i in range(len(X1)):
      p_input.append(X1[i])
      c_input.append(X2[i])
      target.append(Y[i])
      batch_count+=1
      # Note: ">" yields batch_size+1 samples (size: 257 in the trace); use ">=" for exactly batch_size.
      if batch_count>batch_size:
        count=count+1
        prev_X=np.array(p_input,dtype=np.int64)
        cur_X=np.array(c_input,dtype=np.int64)
        cur_y=np.array(target,dtype=np.int32)
        yield ([prev_X,cur_X],cur_y)
        p_input=[]
        c_input=[]
        target=[]
        batch_count=0
    print(count)   # printed once per full pass over the data
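
A minimal sketch of a variant of the generator above, assuming X1, X2 and Y are equally long lists or arrays that support slicing. The name generate_batches is only illustrative. It slices the data instead of appending item by item, yields exactly batch_size samples per batch (the > comparison above yields batch_size+1, which is why the traces show size: 257.0000), and keeps the last, smaller batch of each pass, so math.ceil(len(X1)/batch_size) matches the number of batches per pass.

```python
import numpy as np

def generate_batches(X1, X2, Y, batch_size):
    """Yield ([prev_X, cur_X], cur_y) forever, at most batch_size samples at a time."""
    n = len(X1)
    while True:  # start over after every full pass so the generator never runs dry
        for start in range(0, n, batch_size):
            end = min(start + batch_size, n)  # the last batch of a pass may be smaller
            prev_X = np.array(X1[start:end], dtype=np.int64)
            cur_X = np.array(X2[start:end], dtype=np.int64)
            cur_y = np.array(Y[start:end], dtype=np.int32)
            yield [prev_X, cur_X], cur_y
```

It can be passed to fit_generator in the same way as generate_data, for both the training and the validation data.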

And here is the trace for the first epoch with the while loop added (using generate_data as posted):

Epoch 1/3
335/347 [===========================>..] - ETA: 30s - batch: 167.0000 - size: 257.0000 - loss: 1.2734 - accuracy: 0.8105346
347/347 [==============================] - ETA: 0s - batch: 173.0000 - size: 257.0000 - loss: 1.2635 - accuracy: 0.8113WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training_v1.py:2048: Model.state_updates (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version.
Instructions for updating:
This property should not be used in TensorFlow 2.0, as updates are applied automatically.
86
347/347 [==============================] - 964s 3s/step - batch: 173.0000 - size: 257.0000 - loss: 1.2635 - accuracy: 0.8113 - val_loss: 0.5700 - val_accuracy: 0.8367
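
The step counts in the traces can be checked with a little arithmetic. The question's run stopped at 346/348, and the 346 fused onto the accuracy value in the trace above appears to be the output of print(count) after the first pass. The sketch below assumes a hypothetical training set of 89000 samples (the real len(utt) is not given in the post) and compares the steps requested with math.ceil against the full batches that a generator using the > comparison produces per pass.

```python
import math

def full_batches_per_pass(n_samples, samples_per_batch):
    """Number of complete batches a single pass over the data can produce."""
    return n_samples // samples_per_batch

n_samples = 89000   # hypothetical size, not taken from the post
batch_size = 256

# What the question passes as steps_per_epoch.
steps_requested = math.ceil(n_samples / batch_size)

# What one pass yields when each batch holds batch_size + 1 samples.
steps_yielded = full_batches_per_pass(n_samples, batch_size + 1)

print(steps_requested, steps_yielded)  # prints: 348 346
```

Without the while loop, the generator stops after the yielded batches and Keras interrupts training before validation runs, so val_loss is never computed. With the while loop, the extra steps are simply served from the next pass.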
Answered by Aizayousaf on Feb 04 '26 at 10:02


