Keras Generator with Tensorflow Dataset API - IndexError: pop from empty list

I need to develop an RNN model and would like to use a data generator to feed the training/evaluation loops.

To start with, I have this helper function to use when fetching data from a CSV file.

import numpy as np
import tensorflow as tf

RECORD_DEFAULTS_TRAIN = [[0], [0.0], [0.0], [0.0], [0.0], [0.0], [0.0]]

def decode_csv(line):
    parsed_line = tf.decode_csv(line, RECORD_DEFAULTS_TRAIN)
    label = parsed_line[-1]           # label is the last element of the list
    del parsed_line[-1]               # delete the last element from the list
    del parsed_line[0]                # also delete the first element because it is assumed NOT to be a feature
    features = tf.stack(parsed_line)  # stack features so that you can later vectorize forward prop., etc.
    return features, label
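
As a quick sanity check (this snippet is not part of my actual pipeline, and the row values below are made up purely for illustration), the function can be run on a single CSV line:

sample_line = tf.constant("7,0.1,0.2,0.3,0.4,0.5,0.9")
features, label = decode_csv(sample_line)

with tf.Session() as sess:
    f, l = sess.run([features, label])
    print(f)  # the five feature columns: [0.1 0.2 0.3 0.4 0.5]
    print(l)  # the last column, used as the label: 0.9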

And here is my data generator function:

def data_generator(file_path_list, batch_size):

    filenames = tf.placeholder(tf.string, shape=[None])
    dataset = tf.data.Dataset.from_tensor_slices(filenames)
    dataset = dataset.flat_map(lambda filename: tf.data.TextLineDataset(filename).skip(1).map(decode_csv))
    dataset = dataset.shuffle(buffer_size=1000)
    dataset = dataset.batch(batch_size)
    iterator = dataset.make_initializable_iterator()
    next_element = iterator.get_next()

    with tf.Session() as sess:
        while True:
            sess.run(iterator.initializer, feed_dict={filenames: file_path_list})
            while True:
                try:
                    batch_data, batch_labels = sess.run(next_element)
                    # Dimension of the data needs to be: (batch_size, length_of_each_sequence, nr_inputs_in_each_timestep)
                    # Since the last batch in an epoch can have a different size,
                    # "batch_data.shape[0]" is used instead of batch_size
                    batch_data = np.reshape(batch_data, (batch_data.shape[0], SEQUENCE_LEN, 1))
                except tf.errors.OutOfRangeError:
                    break
                yield (batch_data, batch_labels)
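
Just to make sure the generator behaves as expected, one can pull a single batch before training (this check is not part of my actual code; the expected shapes are what the generator is intended to produce):

gen = data_generator(TRAIN_FILE_PATHS, TRAIN_BATCH_SIZE)
batch_x, batch_y = next(gen)
print(batch_x.shape)  # expected: (TRAIN_BATCH_SIZE, SEQUENCE_LEN, 1)
print(batch_y.shape)  # expected: (TRAIN_BATCH_SIZE,)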

Here is how I build and train my model:

lstm_model = Sequential()
lstm_model.add(LSTM(5, input_shape=(SEQUENCE_LEN, 1), return_sequences=True))
lstm_model.add(LSTM(5, input_shape=(SEQUENCE_LEN, 1), return_sequences=False))
lstm_model.add(Dense(1))

opt = tf.keras.optimizers.Adam(lr=0.001, decay=0.0009)

lstm_model.compile(loss='mean_absolute_error', optimizer=opt, metrics=['accuracy'])

lstm_model.fit_generator(data_generator(TRAIN_FILE_PATHS, TRAIN_BATCH_SIZE),  # generator
                         steps_per_epoch=(NR_TRAIN_EXAMPLES // TRAIN_BATCH_SIZE),
                         epochs=NR_EPOCHS, 
                         verbose=1,
                         validation_data=data_generator(DEV_FILE_PATHS, TRAIN_BATCH_SIZE),
                         validation_steps=(NR_DEV_EXAMPLES // DEV_BATCH_SIZE))

And here is how I evaluate the model on the test set:

lstm_model.evaluate_generator(data_generator(TEST_FILE_PATHS, TEST_BATCH_SIZE), 
                             steps=(NR_TEST_EXAMPLES // TEST_BATCH_SIZE), 
                             verbose=1)

Both after training and after evaluation, I see the following error in the logs:

IndexError: pop from empty list

Here is the last part of the logs after the training ends:

Epoch 200/200
2/2 [==============================] - 0s 28ms/step - loss: 0.0091 - acc: 0.0120 - val_loss: 0.0128 - val_acc: 0.0000e+00
Exception ignored in: <generator object data_generator at 0x7fa97d9e34c0>
Traceback (most recent call last):
  File "<ipython-input-7-2ef5e6514df7>", line 33, in data_generator
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1530, in __exit__
    self._default_graph_context_manager.__exit__(exec_type, exec_value, exec_tb)
  File "/usr/lib/python3.6/contextlib.py", line 99, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 5025, in get_controller
    context.context().context_switches.pop()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/context.py", line 136, in pop
    self.stack.pop()
IndexError: pop from empty list
Exception ignored in: <generator object data_generator at 0x7fa97d9e3678>
Traceback (most recent call last):
  File "<ipython-input-7-2ef5e6514df7>", line 33, in data_generator
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1530, in __exit__
    self._default_graph_context_manager.__exit__(exec_type, exec_value, exec_tb)
  File "/usr/lib/python3.6/contextlib.py", line 99, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 5025, in get_controller
    context.context().context_switches.pop()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/context.py", line 136, in pop
    self.stack.pop()
IndexError: pop from empty list

<tensorflow.python.keras.callbacks.History at 0x7fa97d8b4828>

And here is what I see after running evaluate_generator():

2/2 [==============================] - 0s 28ms/step

Exception ignored in: <generator object data_generator at 0x7fa97d732af0>
Traceback (most recent call last):
  File "<ipython-input-7-2ef5e6514df7>", line 33, in data_generator
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1530, in __exit__
    self._default_graph_context_manager.__exit__(exec_type, exec_value, exec_tb)
  File "/usr/lib/python3.6/contextlib.py", line 99, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 5025, in get_controller
    context.context().context_switches.pop()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/context.py", line 136, in pop
    self.stack.pop()
IndexError: pop from empty list

[0.008863004390150309, 0.0]

What confuses me is why I see the error message IndexError: pop from empty list in every case above. Is this normal, or am I doing something wrong?

1 Answer

Solved. Rather than deleting my post, I want to explain the issue so that it may help other people as well.

I will only give the example for the evaluate_generator(...) function. This is how I was calling it:

lstm_model.evaluate_generator(data_generator(TEST_FILE_PATHS, TEST_BATCH_SIZE), 
                             steps=(NR_TEST_EXAMPLES // TEST_BATCH_SIZE), 
                             verbose=1)

And I changed it as follows:

test_data_generator = data_generator(TEST_FILE_PATHS, TEST_BATCH_SIZE)
lstm_model.evaluate_generator(test_data_generator, 
                              steps=(NR_TEST_EXAMPLES // TEST_BATCH_SIZE), 
                              verbose=1)

And the problem was solved. I had seen both kinds of usage in different places, although not every piece of information one finds on the net is necessarily correct. It is still not clear to me why the change above solves the problem. If anyone knows, I would be happy to hear the explanation.
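
For completeness, the same change can be applied to fit_generator(...) as well. The sketch below just reuses the variable names from my question and assumes the validation generator should use DEV_BATCH_SIZE to match validation_steps:

# Create the generator objects once, then pass them to fit_generator().
train_data_generator = data_generator(TRAIN_FILE_PATHS, TRAIN_BATCH_SIZE)
dev_data_generator = data_generator(DEV_FILE_PATHS, DEV_BATCH_SIZE)

lstm_model.fit_generator(train_data_generator,
                         steps_per_epoch=(NR_TRAIN_EXAMPLES // TRAIN_BATCH_SIZE),
                         epochs=NR_EPOCHS,
                         verbose=1,
                         validation_data=dev_data_generator,
                         validation_steps=(NR_DEV_EXAMPLES // DEV_BATCH_SIZE))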
