ValueError: None values not supported. Code working properly on CPU/GPU but not on TPU

I am trying to train a seq2seq model for language translation, copy-pasting the code from this Kaggle Notebook into Google Colab. The code works fine on CPU and GPU, but it gives me errors when training on a TPU. The same question has already been asked here.

Here is my code:

    strategy = tf.distribute.experimental.TPUStrategy(resolver)
    
    with strategy.scope():
      model = create_model()
      model.compile(optimizer = 'rmsprop', loss = 'categorical_crossentropy')
    
    model.fit_generator(generator = generate_batch(X_train, y_train, batch_size = batch_size),
                        steps_per_epoch = train_samples // batch_size,
                        epochs = epochs,
                        validation_data = generate_batch(X_test, y_test, batch_size = batch_size),
                        validation_steps = val_samples // batch_size)

Traceback:

Epoch 1/2
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-60-940fe0ee3c8b> in <module>()
      3                     epochs = epochs,
      4                     validation_data = generate_batch(X_test, y_test, batch_size = batch_size),
----> 5                     validation_steps = val_samples // batch_size)

10 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py in wrapper(*args, **kwargs)
    992           except Exception as e:  # pylint:disable=broad-except
    993             if hasattr(e, "ag_error_metadata"):
--> 994               raise e.ag_error_metadata.to_exception(e)
    995             else:
    996               raise

ValueError: in user code:
    /usr/local/lib/python3.7/dist-packages/keras/engine/training.py:853 train_function  *
    return step_function(self, iterator)
    /usr/local/lib/python3.7/dist-packages/keras/engine/training.py:842 step_function  **
    outputs = model.distribute_strategy.run(run_step, args=(data,))
...
ValueError: None values not supported.

I couldn't figure out the cause, but I think the error comes from this generate_batch function:

    X, y = lines['english_sentence'], lines['hindi_sentence']
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 34)

    def generate_batch(X = X_train, y = y_train, batch_size = 128):
        while True:
            for j in range(0, len(X), batch_size):
                encoder_input_data = np.zeros((batch_size, max_length_src), dtype='float32')
                decoder_input_data = np.zeros((batch_size, max_length_tar), dtype='float32')
                decoder_target_data = np.zeros((batch_size, max_length_tar, num_decoder_tokens), dtype='float32')

                for i, (input_text, target_text) in enumerate(zip(X[j:j + batch_size], y[j:j + batch_size])):
                    for t, word in enumerate(input_text.split()):
                        encoder_input_data[i, t] = input_token_index[word]
                    for t, word in enumerate(target_text.split()):
                        if t < len(target_text.split()) - 1:
                            decoder_input_data[i, t] = target_token_index[word]  # decoder input: every word except the last
                        if t > 0:
                            decoder_target_data[i, t - 1, target_token_index[word]] = 1.  # decoder target: one-hot, shifted left by one
                yield ([encoder_input_data, decoder_input_data], decoder_target_data)

My Colab notebook - here
Kaggle dataset - here
TensorFlow version - 2.6

Edit - Please don't tell me to downgrade TensorFlow/Keras to 1.x. I can downgrade to TensorFlow 2.0, 2.1, or 2.3, but not 1.x. I don't understand TensorFlow 1.x, and there is no point in using a 3-year-old version.

asked Oct 28 '21 by Adarsh Wase

1 Answer

As stated in the referenced answer in the link you provided, the tf.data API works better with TPUs. To adapt it to your case, use return instead of yield in the generate_batch function:

    def generate_batch(X = X_train, y = y_train, batch_size = 128):
        ...
        return encoder_input_data, decoder_input_data, decoder_target_data

    encoder_input_data, decoder_input_data, decoder_target_data = generate_batch(X_train, y_train, batch_size=128)
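Note that a bare return inside the original loop would hand back only the first batch; since tf.data will do the batching anyway, the function can build the tensors for the whole split in one pass. A minimal sketch (generate_data is an illustrative name; max_length_src, max_length_tar, num_decoder_tokens, input_token_index, and target_token_index come from the original notebook):

    import numpy as np

    def generate_data(X=X_train, y=y_train):
        n = len(X)
        # Same encoding scheme as the original generator, but sized for the full split
        encoder_input_data = np.zeros((n, max_length_src), dtype='float32')
        decoder_input_data = np.zeros((n, max_length_tar), dtype='float32')
        decoder_target_data = np.zeros((n, max_length_tar, num_decoder_tokens), dtype='float32')
        for i, (input_text, target_text) in enumerate(zip(X, y)):
            for t, word in enumerate(input_text.split()):
                encoder_input_data[i, t] = input_token_index[word]
            for t, word in enumerate(target_text.split()):
                if t < len(target_text.split()) - 1:
                    decoder_input_data[i, t] = target_token_index[word]  # decoder input drops the last word
                if t > 0:
                    decoder_target_data[i, t - 1, target_token_index[word]] = 1.  # target shifted left by one
        return encoder_input_data, decoder_input_data, decoder_target_data

    encoder_input_data, decoder_input_data, decoder_target_data = generate_data(X_train, y_train)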

And then use tensorflow.data to structure your data:

    from tensorflow.data import Dataset

    encoder_input_data = Dataset.from_tensor_slices(encoder_input_data)
    decoder_input_data = Dataset.from_tensor_slices(decoder_input_data)
    decoder_target_data = Dataset.from_tensor_slices(decoder_target_data)
    ds = Dataset.zip((encoder_input_data, decoder_input_data, decoder_target_data)).map(map_fn).batch(1024)
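One TPU-specific caveat: TPUs need static tensor shapes, so if the number of samples is not a multiple of the batch size, the final partial batch can fail with a shape error. Dropping the remainder avoids this:

    ds = Dataset.zip((encoder_input_data, decoder_input_data, decoder_target_data)).map(map_fn).batch(1024, drop_remainder=True)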

where map_fn is defined by:

    def map_fn(encoder_input, decoder_input, decoder_target):
        return (encoder_input, decoder_input), decoder_target
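This map simply regroups each element into the ((encoder_input, decoder_input), target) structure that Keras expects for a multi-input model, mirroring the [encoder_input_data, decoder_input_data] list the original generator yielded.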

And finally use Model.fit instead of Model.fit_generator:

    model.fit(x=ds, epochs=epochs)
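If you also want validation during training, you can build a second dataset from X_test and y_test in exactly the same way and pass it to fit. A sketch, reusing the generate_data helper from above (the val_* names are just illustrative):

    val_enc, val_dec_in, val_dec_target = generate_data(X_test, y_test)
    val_ds = Dataset.zip((Dataset.from_tensor_slices(val_enc),
                          Dataset.from_tensor_slices(val_dec_in),
                          Dataset.from_tensor_slices(val_dec_target))).map(map_fn).batch(1024, drop_remainder=True)

    model.fit(x=ds, validation_data=val_ds, epochs=epochs)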
answered Nov 15 '22 by R. Marolahy