tf.data with multiple inputs / outputs in Keras

For applications such as text-pair similarity, the input data looks like: pair_1, pair_2. Problems like these usually have multiple inputs. Previously, I trained my models successfully with:

model.fit([pair_1, pair_2], labels, epochs=50)

I decided to replace my input pipeline with the tf.data API. To this end, I create a Dataset like this:

dataset = tf.data.Dataset.from_tensor_slices((pair_1, pair_2, labels))

It compiles successfully, but when training starts it throws the following exception:

AttributeError: 'tuple' object has no attribute 'ndim'

My Keras and TensorFlow versions are 2.1.6 and 1.11.0, respectively. I found a similar issue in the TensorFlow repository: tf.keras multi-input models don't work when using tf.data.Dataset.

Does anyone know how to fix the issue?

Here is the main part of the code:

import tensorflow as tf
from tensorflow import keras  # assumption: tf.keras, as in the linked issue

# test, train, nb_vocab, embedding and _BATCH_SIZE are defined elsewhere.
(q1_test, q2_test, label_test) = test
(q1_train, q2_train, label_train) = train


def tfdata_generator(sent1, sent2, labels, is_training, batch_size):
    '''Construct a data generator using tf.Dataset'''
    dataset = tf.data.Dataset.from_tensor_slices((sent1, sent2, labels))
    if is_training:
        dataset = dataset.shuffle(1000)  # depends on sample size
    # Batch so that steps_per_epoch = len(...) // _BATCH_SIZE covers one epoch.
    dataset = dataset.batch(batch_size)
    dataset = dataset.repeat()
    dataset = dataset.prefetch(tf.contrib.data.AUTOTUNE)
    return dataset


train_dataset = tfdata_generator(q1_train, q2_train, label_train,
                                 is_training=True, batch_size=_BATCH_SIZE)
test_dataset = tfdata_generator(q1_test, q2_test, label_test,
                                is_training=False, batch_size=_BATCH_SIZE)

inps1 = keras.layers.Input(shape=(50,))
inps2 = keras.layers.Input(shape=(50,))

# One frozen embedding layer shared by both inputs.
embed = keras.layers.Embedding(input_dim=nb_vocab, output_dim=300,
                               weights=[embedding], trainable=False)
embed1 = embed(inps1)
embed2 = embed(inps2)

# One GRU encoder shared by both inputs.
gru = keras.layers.CuDNNGRU(256)
gru1 = gru(embed1)
gru2 = gru(embed2)

concat = keras.layers.concatenate([gru1, gru2])

preds = keras.layers.Dense(1, activation='sigmoid')(concat)

model = keras.models.Model(inputs=[inps1, inps2], outputs=preds)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()

model.fit(
    train_dataset.make_one_shot_iterator(),
    steps_per_epoch=len(q1_train) // _BATCH_SIZE,
    epochs=50,
    validation_data=test_dataset.make_one_shot_iterator(),
    validation_steps=len(q1_test) // _BATCH_SIZE,
    verbose=1)

Does tf.keras Sequential support multiple inputs?

Not Sequential itself, which is a plain stack of layers with a single input and a single output. Keras is able to handle multiple inputs (and even multiple outputs) via its functional API; model subclassing is the third way to create a Keras model.
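As a rough illustration of the functional API taking two inputs, the sketch below builds a small two-input model and runs one batch through it. It assumes TensorFlow 2.x; the layer sizes and the names input_a and input_b are made up for the example and do not come from the question.

```python
import tensorflow as tf

# Two separate inputs, each a vector of 5 features (hypothetical sizes).
in_a = tf.keras.layers.Input(shape=(5,), name="input_a")
in_b = tf.keras.layers.Input(shape=(5,), name="input_b")

# The same Dense layer is applied to both inputs, so its weights are shared.
shared = tf.keras.layers.Dense(8, activation="relu")
merged = tf.keras.layers.concatenate([shared(in_a), shared(in_b)])
out = tf.keras.layers.Dense(1, activation="sigmoid")(merged)

# The functional API accepts a list of inputs.
model = tf.keras.Model(inputs=[in_a, in_b], outputs=out)

# Run one batch of 4 examples through the model.
scores = model([tf.zeros((4, 5)), tf.ones((4, 5))])
print(scores.shape)
```

A Sequential model has no place to attach the second Input, which is why multi-input models are usually written this way.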

What does tf.data.Dataset.from_tensor_slices do?

from_tensors makes a dataset where each input tensor is like a row of your dataset, and from_tensor_slices makes a dataset where each input tensor is a column of your data; so in the latter case all tensors must be the same length, and the elements (rows) of the resulting dataset are tuples with one ...
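The difference is easy to see by counting elements. This is a small sketch assuming TensorFlow 2.x with eager execution; the arrays are made-up toy data.

```python
import numpy as np
import tensorflow as tf

features = np.arange(12, dtype=np.int64).reshape(4, 3)  # 4 rows, 3 columns
labels = np.array([0, 1, 0, 1], dtype=np.int64)         # one label per row

# from_tensors: the whole (features, labels) pair becomes a single element.
whole = tf.data.Dataset.from_tensors((features, labels))

# from_tensor_slices: slices along the first axis, one element per row.
sliced = tf.data.Dataset.from_tensor_slices((features, labels))

n_whole = sum(1 for _ in whole)
n_sliced = sum(1 for _ in sliced)
first_features, first_label = next(iter(sliced))

print(n_whole, n_sliced)     # 1 4
print(first_features.shape)  # (3,)
```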

Does tf.data use the GPU?

If a TensorFlow operation has both CPU and GPU implementations, by default, the GPU device is prioritized when the operation is assigned. For example, tf.matmul has both CPU and GPU kernels and on a system with devices CPU:0 and GPU:0, the GPU:0 device is selected to run tf.

What is tf.data prefetch?

Dataset.prefetch is a tf.data transformation. It can be used to decouple the time when data is produced from the time when data is consumed. In particular, the transformation uses a background thread and an internal buffer to prefetch elements from the input dataset ahead of the time they are requested.
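A minimal sketch of where prefetch usually sits in a pipeline, assuming TensorFlow 2.4 or later, where the constant is exposed as tf.data.AUTOTUNE (the question's TF 1.11 code uses tf.contrib.data.AUTOTUNE instead). The build_pipeline helper is hypothetical. Prefetching only changes when elements are produced, not what they are, so both pipelines yield the same batches.

```python
import tensorflow as tf


def build_pipeline(prefetch):
    """Toy pipeline: the numbers 0..7, doubled, in batches of 4."""
    ds = tf.data.Dataset.range(8)
    ds = ds.map(lambda x: x * 2)
    ds = ds.batch(4)
    if prefetch:
        # Prefetch last, so whole batches are prepared in the background
        # while the consumer works on the previous one.
        ds = ds.prefetch(tf.data.AUTOTUNE)
    return ds


plain = [batch.numpy().tolist() for batch in build_pipeline(prefetch=False)]
prefetched = [batch.numpy().tolist() for batch in build_pipeline(prefetch=True)]
print(plain == prefetched)  # True
```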

I'm not using Keras, but I would go with tf.data.Dataset.from_generator(), like this:

import numpy as np
import tensorflow as tf


def _input_fn():
    sent1 = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=np.int64)
    sent2 = np.array([20, 25, 35, 40, 600, 30, 20, 30], dtype=np.int64)
    sent1 = np.reshape(sent1, (8, 1, 1))
    sent2 = np.reshape(sent2, (8, 1, 1))

    labels = np.array([40, 30, 20, 10, 80, 70, 50, 60], dtype=np.int64)
    labels = np.reshape(labels, (8, 1))

    def generator():
        # Yield one (features, label) example per call; the dict keys
        # name the model inputs.
        for s1, s2, l in zip(sent1, sent2, labels):
            yield {"input_1": s1, "input_2": s2}, l

    dataset = tf.data.Dataset.from_generator(
        generator,
        output_types=({"input_1": tf.int64, "input_2": tf.int64}, tf.int64))
    dataset = dataset.batch(2)
    return dataset

...

model.fit(_input_fn(), epochs=10, steps_per_epoch=4)

This generator can iterate over, e.g., your text files or NumPy arrays and yield one example on every call. In this example, I assume that the words of the sentences have already been converted to their indices in the vocabulary.
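With output_types alone, the dataset does not know the shapes of the yielded arrays. Recent TensorFlow 2.x releases also accept an output_signature argument built from tf.TensorSpec objects, which declares dtype and shape together. The following is a sketch under that assumption, with made-up toy arrays; it is not part of the original answer.

```python
import numpy as np
import tensorflow as tf

sent1 = np.arange(1, 9, dtype=np.int64).reshape(8, 1)
sent2 = np.arange(11, 19, dtype=np.int64).reshape(8, 1)
labels = np.arange(8, dtype=np.int64).reshape(8, 1)


def generator():
    # One ({input name: features}, label) example per iteration.
    for s1, s2, l in zip(sent1, sent2, labels):
        yield {"input_1": s1, "input_2": s2}, l


# Same nesting as the yielded values, with a TensorSpec at every leaf.
signature = (
    {
        "input_1": tf.TensorSpec(shape=(1,), dtype=tf.int64),
        "input_2": tf.TensorSpec(shape=(1,), dtype=tf.int64),
    },
    tf.TensorSpec(shape=(1,), dtype=tf.int64),
)

dataset = tf.data.Dataset.from_generator(generator, output_signature=signature)
dataset = dataset.batch(2)

features, label_batch = next(iter(dataset))
print(features["input_1"].shape, label_batch.shape)  # (2, 1) (2, 1)
```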

Edit: Since the OP asked, it should also be possible with Dataset.from_tensor_slices():

def _input_fn():
    sent1 = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=np.int64)
    sent2 = np.array([20, 25, 35, 40, 600, 30, 20, 30], dtype=np.int64)
    sent1 = np.reshape(sent1, (8, 1))
    sent2 = np.reshape(sent2, (8, 1))

    labels = np.array([40, 30, 20, 10, 80, 70, 50, 60], dtype=np.int64)
    labels = np.reshape(labels, (8,))

    dataset = tf.data.Dataset.from_tensor_slices(
        ({"input_1": sent1, "input_2": sent2}, labels))
    dataset = dataset.batch(2, drop_remainder=True)
    return dataset
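The dictionary keys in these snippets are meant to line up with the names of the model's Input layers (input_1 and input_2 look like the names Keras assigns by default). A minimal sketch of that matching with explicitly named inputs, assuming TensorFlow 2.x; the names sent_a and sent_b and the tiny model are invented for the illustration.

```python
import numpy as np
import tensorflow as tf

sent1 = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=np.float32).reshape(8, 1)
sent2 = np.array([20, 25, 35, 40, 600, 30, 20, 30], dtype=np.float32).reshape(8, 1)
labels = np.array([40, 30, 20, 10, 80, 70, 50, 60], dtype=np.float32).reshape(8, 1)

# Name the inputs explicitly so the dataset keys can refer to them.
in_1 = tf.keras.layers.Input(shape=(1,), name="sent_a")
in_2 = tf.keras.layers.Input(shape=(1,), name="sent_b")
merged = tf.keras.layers.concatenate([in_1, in_2])
out = tf.keras.layers.Dense(1)(merged)

model = tf.keras.Model(inputs=[in_1, in_2], outputs=out)
model.compile(optimizer="adam", loss="mse")

# Each element is ({input name: features}, label).
dataset = tf.data.Dataset.from_tensor_slices(
    ({"sent_a": sent1, "sent_b": sent2}, labels)).batch(2)

history = model.fit(dataset, epochs=1, verbose=0)
print(len(history.history["loss"]))  # one loss value per epoch
```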

One way to solve your issue could be to use the zip dataset to combine your various inputs:

import numpy as np
import tensorflow as tf

sent1 = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=np.float32)
sent2 = np.array([20, 25, 35, 40, 600, 30, 20, 30], dtype=np.float32)
sent1 = np.reshape(sent1, (8, 1, 1))
sent2 = np.reshape(sent2, (8, 1, 1))

labels = np.array([40, 30, 20, 10, 80, 70, 50, 60], dtype=np.float32)
labels = np.reshape(labels, (8, 1))

# Group the two inputs into one dataset and the labels into another.
dataset_12 = tf.data.Dataset.from_tensor_slices((sent1, sent2))
dataset_label = tf.data.Dataset.from_tensor_slices(labels)

# Zipping yields ((sent1, sent2), label) elements.
dataset = tf.data.Dataset.zip((dataset_12, dataset_label)).batch(2).repeat()

# model is the two-input Keras model, defined elsewhere.
model.fit(dataset, epochs=10, steps_per_epoch=4)

This will print:

Epoch 1/10
4/4 [==============================] - 2s 503ms/step...
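The zip produces elements of the form ((input_1, input_2), label). As far as I know, the same nesting can also be passed straight to from_tensor_slices, and comparing it with the flat three-tuple from the question shows the structural difference. In TensorFlow 2.x, Keras treats a three-element tuple as (inputs, targets, sample_weights), which is probably not what a two-input model wants. A sketch assuming TensorFlow 2.x, with toy arrays:

```python
import numpy as np
import tensorflow as tf

sent1 = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=np.float32).reshape(8, 1)
sent2 = np.array([20, 25, 35, 40, 600, 30, 20, 30], dtype=np.float32).reshape(8, 1)
labels = np.array([40, 30, 20, 10, 80, 70, 50, 60], dtype=np.float32).reshape(8, 1)

# Flat structure, as in the question: three components side by side.
flat = tf.data.Dataset.from_tensor_slices((sent1, sent2, labels))

# Nested structure: (inputs, label), where inputs is itself a pair.
nested = tf.data.Dataset.from_tensor_slices(((sent1, sent2), labels)).batch(2)

(batch_1, batch_2), batch_labels = next(iter(nested))
print(len(flat.element_spec), len(nested.element_spec))  # 3 2
print(batch_1.shape, batch_labels.shape)                 # (2, 1) (2, 1)
```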

Tags: tensorflow, keras, tensorflow-datasets

Amir

2 Answers

lhlmgr

pfm
