
Tensorflow - Keras: Consider either turning off auto-sharding or switching the auto_shard_policy to DATA to shard this dataset

I am training a model in Keras / TensorFlow using the following distribution strategy:

strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy()

When training starts, I get the following error / warning:

Consider either turning off auto-sharding or switching the auto_shard_policy to DATA to shard this dataset. You can do this by creating a new `tf.data.Options()` object then setting `options.experimental_distribute.auto_shard_policy = AutoShardPolicy.DATA` before applying the options object to the dataset via `dataset.with_options(options)`.
    2020-12-16 17:12:20.885741: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:127] None of the MLIR optimization passes are enabled (registered 2)
    2020-12-16 17:12:20.905570: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 3593105000 Hz
    Epoch 1/40

Any help is appreciated.

Amarnath R asked Dec 16 '20 11:12



1 Answer

This message is new in TensorFlow 2.4.0. It hints at a solution, but that solution presupposes that your data is a tf.data.Dataset object. Previously there was no strict requirement to supply input data in this form (NumPy arrays were fine), but it now seems to be required when using the distribution strategies (e.g. tf.distribute.MirroredStrategy()). In any event, there does not appear to be a way to silence TensorFlow's latest console noise without wrapping your data in a Dataset object.

So supposing your current code looks something like this:

import numpy as np
import tensorflow as tf

strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy()
with strategy.scope():
    model = ... # awesome model definition

train_x, train_y = np.array(...), np.array(...)
val_x, val_y = np.array(...), np.array(...)

batch_size = 32
model.fit(train_x, train_y, batch_size=batch_size, validation_data=(val_x, val_y))

It needs to be changed to look like this:

import numpy as np
import tensorflow as tf

strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy()
with strategy.scope():
    model = ... # awesome model definition

train_x, train_y = np.array(...), np.array(...)
val_x, val_y = np.array(...), np.array(...)

# Wrap data in Dataset objects.
train_data = tf.data.Dataset.from_tensor_slices((train_x, train_y))
val_data = tf.data.Dataset.from_tensor_slices((val_x, val_y))

# The batch size must now be set on the Dataset objects.
batch_size = 32
train_data = train_data.batch(batch_size)
val_data = val_data.batch(batch_size)

# Disable AutoShard.
options = tf.data.Options()
options.experimental_distribute.auto_shard_policy = tf.data.experimental.AutoShardPolicy.OFF
train_data = train_data.with_options(options)
val_data = val_data.with_options(options)

model.fit(train_data, validation_data=val_data)
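
The code above turns auto-sharding off entirely. The warning text itself also offers a second option: switching the policy to DATA, in which case each worker reads the whole dataset and keeps only its own share of the elements. Below is a minimal sketch of that variant. The toy arrays are hypothetical stand-ins for train_x and train_y, and the final lines only confirm that the policy is attached to the dataset; they do not show that the message disappears in any particular multi-worker setup.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy data standing in for train_x / train_y.
train_x = np.random.rand(100, 4).astype("float32")
train_y = np.random.randint(0, 2, size=(100,))

# Wrap the arrays in a Dataset and batch them, as in the answer.
batch_size = 32
train_data = tf.data.Dataset.from_tensor_slices((train_x, train_y)).batch(batch_size)

# Shard by element (DATA) instead of disabling sharding (OFF).
options = tf.data.Options()
options.experimental_distribute.auto_shard_policy = (
    tf.data.experimental.AutoShardPolicy.DATA
)
train_data = train_data.with_options(options)

# Read the policy back from the dataset to confirm it was applied.
policy = train_data.options().experimental_distribute.auto_shard_policy
print(policy)
```

Whether OFF or DATA is the better choice depends on whether every worker should see all of the data (OFF) or a disjoint part of it (DATA).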

Note that if you don't set the batch size on the Dataset object, you'll get a cryptic error like this:

File "/usr/lib/python3.8/site-packages/tensorflow/python/data/experimental/ops/distribute.py", line 496, in get_static_batch_dim
    return output_shape.dims[0].value
IndexError: list index out of range
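
One plausible reading of this traceback (an inference, not something the traceback states) is that an unbatched dataset built from 1-D labels yields scalar label elements, so there is no dimension 0 to read a batch size from. A small sketch that inspects element_spec before and after batching makes the difference visible. The array shapes here are made up for illustration.

```python
import numpy as np
import tensorflow as tf

# Hypothetical data: 100 samples with 4 features each, plus 1-D labels.
x = np.zeros((100, 4), dtype="float32")
y = np.zeros((100,), dtype="int32")

unbatched = tf.data.Dataset.from_tensor_slices((x, y))
batched = unbatched.batch(32)

# Rank of the label component: 0 means scalar (no batch dimension).
unbatched_label_rank = unbatched.element_spec[1].shape.rank
batched_label_rank = batched.element_spec[1].shape.rank

print(unbatched.element_spec)  # per-sample shapes, no leading batch dimension
print(batched.element_spec)    # shapes gain a leading batch dimension
```

Checking element_spec like this before calling model.fit is a cheap way to confirm that the dataset has actually been batched.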

Graham501617 answered Oct 09 '22 18:10