tf.data.Dataset: The `batch_size` argument must not be specified for the given input type

Q: What is TF data dataset?

TensorFlow Datasets (TFDS) is a collection of ready-to-use datasets for TensorFlow and other Python ML frameworks such as JAX. All datasets are exposed as tf.data.Dataset objects, which enables easy-to-use, high-performance input pipelines.

Q: How do you get the shape of a TF dataset?

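To get the shape of a tensor, use the tf.shape() function, which returns the shape of the given tensor. For a dataset, the shapes of its elements are available through dataset.element_spec in TensorFlow 2.x, or through tf.compat.v1.data.get_output_shapes(dataset) in TensorFlow 1.x.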

Q: What is a prefetch dataset?

The tf.data.Dataset.prefetch transformation can be used to decouple the time when data is produced from the time when data is consumed. In particular, the transformation uses a background thread and an internal buffer to prefetch elements from the input dataset ahead of the time they are requested.
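The idea can be illustrated without TensorFlow. The following is a minimal pure-Python sketch of the same producer/consumer decoupling, not the tf.data implementation; the names `prefetch` and `slow_source` are hypothetical and exist only for this example.

```python
import queue
import threading
import time

_DONE = object()  # sentinel marking the end of the stream


def prefetch(iterable, buffer_size):
    """Yield items from `iterable`, producing them ahead of time in a background thread."""
    buffer = queue.Queue(maxsize=buffer_size)

    def producer():
        for item in iterable:
            buffer.put(item)  # blocks while the buffer is full
        buffer.put(_DONE)

    threading.Thread(target=producer, daemon=True).start()

    while True:
        item = buffer.get()  # blocks only if the producer has fallen behind
        if item is _DONE:
            return
        yield item


def slow_source(n, delay=0.01):
    """Stand-in for an input pipeline whose elements take time to produce."""
    for i in range(n):
        time.sleep(delay)
        yield i


results = list(prefetch(slow_source(5), buffer_size=2))
```

While the consumer is busy with one element, the background thread can already prepare up to `buffer_size` further elements, which is the effect the prefetch transformation aims for.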

Tags: tensorflow, google-colaboratory, keras, google-cloud-tpu, talos

I'm using Talos and a Google Colab TPU to run hyperparameter tuning of a Keras model. Note that I'm using TensorFlow 1.15.0 and Keras 2.2.4-tf.

import os
import tensorflow as tf
import talos as ta
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
from sklearn.model_selection import train_test_split

def iris_model(x_train, y_train, x_val, y_val, params):

    # Specify a distributed strategy to use TPU
    resolver = tf.contrib.cluster_resolver.TPUClusterResolver(tpu='grpc://' + os.environ['COLAB_TPU_ADDR'])
    tf.contrib.distribute.initialize_tpu_system(resolver)
    strategy = tf.contrib.distribute.TPUStrategy(resolver)

    # Use the strategy to create and compile a Keras model
    with strategy.scope():
      model = Sequential()
      model.add(Dense(32, input_shape=(4,), activation=tf.nn.relu, name="relu"))
      model.add(Dense(3, activation=tf.nn.softmax, name="softmax"))
      model.compile(optimizer=Adam(learning_rate=0.1), loss=params['losses'])

    # Convert data type to use TPU
    x_train = x_train.astype('float32')
    x_val = x_val.astype('float32')

    dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
    dataset = dataset.cache()
    dataset = dataset.shuffle(1000, reshuffle_each_iteration=True).repeat()
    dataset = dataset.batch(params['batch_size'], drop_remainder=True)

    # Fit the Keras model on the dataset
    out = model.fit(dataset, batch_size=params['batch_size'], epochs=params['epochs'], validation_data=[x_val, y_val], verbose=0, steps_per_epoch=2)

    return out, model

# Load dataset
X, y = ta.templates.datasets.iris()

# Train and test set
x_train, x_val, y_train, y_val = train_test_split(X, y, test_size=0.30, shuffle=False)

# Create the hyperparameter distributions
p = {'losses': ['logcosh'], 'batch_size': [128, 256, 384, 512, 1024], 'epochs': [10, 20]}

# Use Talos to scan the best hyperparameters of the Keras model
scan_object = ta.Scan(x_train, y_train, params=p, model=iris_model, experiment_name='test', x_val=x_val, y_val=y_val, fraction_limit=0.1)

After converting the train set to a Dataset using tf.data.Dataset, I get the following error when fitting the model with out = model.fit:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-c812209b95d0> in <module>()
      8 
      9 # Use Talos to scan the best hyperparameters of the Keras model
---> 10 scan_object = ta.Scan(x_train, y_train, params=p, model=iris_model, experiment_name='test', x_val=x_val, y_val=y_val, fraction_limit=0.1)

8 frames
/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training.py in _validate_or_infer_batch_size(self, batch_size, steps, x)
   1813             'The `batch_size` argument must not be specified for the given '
   1814             'input type. Received input: {}, batch_size: {}'.format(
-> 1815                 x, batch_size))
   1816       return
   1817 

ValueError: The `batch_size` argument must not be specified for the given input type. Received input: <DatasetV1Adapter shapes: ((512, 4), (512, 3)), types: (tf.float32, tf.float32)>, batch_size: 512

Then, if I follow that instruction and don't pass the batch_size argument to model.fit, I get another error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-c812209b95d0> in <module>()
      8 
      9 # Use Talos to scan the best hyperparameters of the Keras model
---> 10 scan_object = ta.Scan(x_train, y_train, params=p, model=iris_model, experiment_name='test', x_val=x_val, y_val=y_val, fraction_limit=0.1)

8 frames
/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training.py in _distribution_standardize_user_data(self, x, y, sample_weight, class_weight, batch_size, validation_split, shuffle, epochs, allow_partial_batch)
   2307             strategy) and not drop_remainder:
   2308           dataset_size = first_x_value.shape[0]
-> 2309           if dataset_size % batch_size == 0:
   2310             drop_remainder = True
   2311 

TypeError: unsupported operand type(s) for %: 'int' and 'NoneType'


asked Nov 20 '19 15:11

Sami Belkacem


1 Answer

It looks to me like the problem with your code is that the training and validation data are not in the same format: you are batching the training data (a tf.data.Dataset) but passing the validation examples as unbatched arrays.
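A minimal sketch of why the second error is likely raised, based only on the line shown in the question's traceback (`dataset_size % batch_size == 0`). The helper `divides_evenly` is hypothetical and is not Keras code, and the size 45 assumes a 30% validation split of a 150-row Iris table.

```python
def divides_evenly(dataset_size, batch_size):
    """Mirror the check `dataset_size % batch_size == 0` from the traceback."""
    return dataset_size % batch_size == 0


# With an explicit batch size the check is ordinary integer arithmetic.
even = divides_evenly(45, 15)

# With batch_size=None (nothing passed to fit, and array-style input that
# carries no batch size of its own) the same check fails.
try:
    divides_evenly(45, None)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

Array inputs need a batch size from somewhere, whereas an already batched dataset does not, which is why giving both inputs the same batched format avoids the check.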

You can ensure that they are in the same format by replacing the bottom half of your iris_model function with this:

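# Goes inside iris_model, after model.compile(...); `params` and `model`
# come from the enclosing function.
def fix_data(x, y):
    # Cast to float32 for the TPU, then build a repeated, batched dataset
    x = x.astype('float32')
    ds = tf.data.Dataset.from_tensor_slices((x, y))
    ds = ds.cache()
    ds = ds.shuffle(1000, reshuffle_each_iteration=True)
    ds = ds.repeat()
    ds = ds.batch(params['batch_size'], drop_remainder=True)
    return ds

train = fix_data(x_train, y_train)
val = fix_data(x_val, y_val)

# Fit the Keras model on the dataset; batch_size is not passed because
# both datasets are already batched
out = model.fit(x=train, epochs=params['epochs'],
                steps_per_epoch=2,
                validation_data=val,
                validation_steps=2)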

At least this works for me and your code runs without error.
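One detail worth noting is the order of repeat() and batch(): the batch sizes in the question (128 to 1024) are probably larger than the number of training rows. The sketch below is a pure-Python stand-in built on itertools, not tf.data itself; the helper `batches` is hypothetical, shuffling is left out, and the 105-row training size assumes a 70% split of a 150-row Iris table.

```python
from itertools import cycle, islice


def batches(samples, batch_size, repeat, max_batches):
    """Return up to `max_batches` full batches, dropping any partial batch."""
    stream = cycle(samples) if repeat else iter(samples)
    out = []
    while len(out) < max_batches:
        batch = list(islice(stream, batch_size))
        if len(batch) < batch_size:  # like drop_remainder=True: discard a short batch
            break
        out.append(batch)
    return out


train_rows = list(range(105))  # assumed size: 70% of a 150-row Iris table

# max_batches=2 mirrors steps_per_epoch=2 in the call to model.fit
without_repeat = batches(train_rows, batch_size=128, repeat=False, max_batches=2)
with_repeat = batches(train_rows, batch_size=128, repeat=True, max_batches=2)
```

Without repetition, a single pass over 105 rows cannot fill one batch of 128, so dropping the remainder leaves nothing; repeating first lets batches wrap around the data and always be full.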

answered Oct 11 '22 11:10

Björn Lindqvist
