I am trying to train a seq2seq model for language translation, and I am copy-pasting code from this Kaggle Notebook on Google Colab. The code works fine on CPU and GPU, but it gives me errors while training on a TPU. This same question has already been asked here.
Here is my code:
strategy = tf.distribute.experimental.TPUStrategy(resolver)
with strategy.scope():
    model = create_model()
    model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

model.fit_generator(generator=generate_batch(X_train, y_train, batch_size=batch_size),
                    steps_per_epoch=train_samples // batch_size,
                    epochs=epochs,
                    validation_data=generate_batch(X_test, y_test, batch_size=batch_size),
                    validation_steps=val_samples // batch_size)
Traceback:
Epoch 1/2
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-60-940fe0ee3c8b> in <module>()
3 epochs = epochs,
4 validation_data = generate_batch(X_test, y_test, batch_size = batch_size),
----> 5 validation_steps = val_samples // batch_size)
10 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py in wrapper(*args, **kwargs)
992 except Exception as e: # pylint:disable=broad-except
993 if hasattr(e, "ag_error_metadata"):
--> 994 raise e.ag_error_metadata.to_exception(e)
995 else:
996 raise
ValueError: in user code:
/usr/local/lib/python3.7/dist-packages/keras/engine/training.py:853 train_function *
return step_function(self, iterator)
/usr/local/lib/python3.7/dist-packages/keras/engine/training.py:842 step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
...
ValueError: None values not supported.
I couldn't figure out the cause, but I think the error comes from this generate_batch function:
X, y = lines['english_sentence'], lines['hindi_sentence']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=34)

def generate_batch(X=X_train, y=y_train, batch_size=128):
    while True:
        for j in range(0, len(X), batch_size):
            encoder_input_data = np.zeros((batch_size, max_length_src), dtype='float32')
            decoder_input_data = np.zeros((batch_size, max_length_tar), dtype='float32')
            decoder_target_data = np.zeros((batch_size, max_length_tar, num_decoder_tokens), dtype='float32')
            for i, (input_text, target_text) in enumerate(zip(X[j:j + batch_size], y[j:j + batch_size])):
                for t, word in enumerate(input_text.split()):
                    encoder_input_data[i, t] = input_token_index[word]
                for t, word in enumerate(target_text.split()):
                    if t < len(target_text.split()) - 1:
                        decoder_input_data[i, t] = target_token_index[word]
                    if t > 0:
                        decoder_target_data[i, t - 1, target_token_index[word]] = 1.
            yield ([encoder_input_data, decoder_input_data], decoder_target_data)
My Colab notebook - here
Kaggle dataset - here
TensorFlow version - 2.6
Edit - Please don't tell me to downgrade TensorFlow/Keras to 1.x. I can downgrade to TensorFlow 2.0, 2.1, or 2.3, but not to 1.x. I don't understand TensorFlow 1.x, and there is no point in using a 3-year-old version.
As stated in the referenced answer in the link you provided, the tf.data API works better with TPUs. To adapt it to your case, try using return instead of yield in the generate_batch function:
def generate_batch(X=X_train, y=y_train, batch_size=128):
    ...
    return encoder_input_data, decoder_input_data, decoder_target_data

encoder_input_data, decoder_input_data, decoder_target_data = generate_batch(X_train, y_train, batch_size=128)
And then use the tf.data API to structure your data:
from tensorflow.data import Dataset
encoder_input_data = Dataset.from_tensor_slices(encoder_input_data)
decoder_input_data = Dataset.from_tensor_slices(decoder_input_data)
decoder_target_data = Dataset.from_tensor_slices(decoder_target_data)
ds = Dataset.zip((encoder_input_data, decoder_input_data, decoder_target_data)).map(map_fn).batch(1024)
where map_fn is defined by:
def map_fn(encoder_input, decoder_input, decoder_target):
    return (encoder_input, decoder_input), decoder_target
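Put together, the pipeline above can be sketched end to end with dummy arrays. The shapes below are placeholders, not the real max_length_src / max_length_tar / num_decoder_tokens values, but they show how zip + map_fn + batch regroups the three arrays into the (inputs, targets) structure that Model.fit expects:

```python
import numpy as np
import tensorflow as tf

# Dummy stand-ins for the arrays returned by the return-based generate_batch;
# shapes are illustrative placeholders, not the real sequence lengths or vocab size.
encoder_input_data = np.zeros((8, 5), dtype='float32')
decoder_input_data = np.zeros((8, 6), dtype='float32')
decoder_target_data = np.zeros((8, 6, 10), dtype='float32')

def map_fn(encoder_input, decoder_input, decoder_target):
    # Regroup the three tensors into ((inputs), target) as Model.fit expects.
    return (encoder_input, decoder_input), decoder_target

enc_ds = tf.data.Dataset.from_tensor_slices(encoder_input_data)
dec_ds = tf.data.Dataset.from_tensor_slices(decoder_input_data)
tgt_ds = tf.data.Dataset.from_tensor_slices(decoder_target_data)

ds = tf.data.Dataset.zip((enc_ds, dec_ds, tgt_ds)).map(map_fn).batch(4)

(enc, dec), tgt = next(iter(ds))
print(enc.shape, dec.shape, tgt.shape)  # (4, 5) (4, 6) (4, 6, 10)
```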
And finally use Model.fit instead of Model.fit_generator:
model.fit(x=ds, epochs=epochs)
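One TPU-specific caveat worth adding (an assumption on my part, not something the traceback proves): XLA, which compiles the TPU program, requires static tensor shapes, so a final partial batch (when the sample count isn't divisible by the batch size) can cause problems. tf.data's drop_remainder flag discards it, shown here on a toy dataset of 10 elements:

```python
import tensorflow as tf

# XLA on TPUs compiles for fixed shapes; a final partial batch would have a
# different shape, so drop it when batching. Toy example: 10 elements, batch 4,
# the trailing batch of 2 is discarded.
ds = tf.data.Dataset.range(10).batch(4, drop_remainder=True)
print([len(b) for b in ds.as_numpy_iterator()])  # [4, 4]
```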