
How do you save a Tensorflow dataset to a file?

There are at least two more questions like this on SO but not a single one has been answered.

I have a dataset of the form:

<TensorSliceDataset shapes: ((512,), (512,), (512,), ()), types: (tf.int32, tf.int32, tf.int32, tf.int32)>

and another of the form:

<BatchDataset shapes: ((None, 512), (None, 512), (None, 512), (None,)), types: (tf.int32, tf.int32, tf.int32, tf.int32)>

I have looked and looked but I can't find the code to save these datasets to files that can be loaded later. The closest I got was this page in the TensorFlow docs, which suggests serializing the tensors using tf.io.serialize_tensor and then writing them to a file using tf.data.experimental.TFRecordWriter.

However, when I tried this using the code:

dataset.map(tf.io.serialize_tensor)
writer = tf.data.experimental.TFRecordWriter('mydata.tfrecord')
writer.write(dataset)

I get an error on the first line:

TypeError: serialize_tensor() takes from 1 to 2 positional arguments but 4 were given

How can I modify the above (or do something else) to accomplish my goal?

asked May 11 '20 at 01:05 by Vivek Subramanian

1 Answer

An issue was opened on GitHub, and it appears that TF 2.3 added a new feature for writing a dataset to disk:

https://www.tensorflow.org/api_docs/python/tf/data/experimental/save
https://www.tensorflow.org/api_docs/python/tf/data/experimental/load

I haven't tested this feature yet, but it seems to do what you want.
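
For illustration, here is a minimal sketch of how the two linked functions might be used, assuming TF 2.3 or later. The toy dataset only mimics the shapes and types from the question, and the output directory name is a hypothetical placeholder. Note that save writes a directory rather than a single file.

```python
import os
import tempfile

import tensorflow as tf

# Toy stand-in for the dataset in the question: three int32 vectors of
# length 512 plus a scalar int32 label per element (values are made up).
num_examples = 8
features = tf.zeros((num_examples, 512), dtype=tf.int32)
labels = tf.range(num_examples, dtype=tf.int32)
dataset = tf.data.Dataset.from_tensor_slices(
    (features, features, features, labels)
)

# Hypothetical output location; save() creates a directory at this path.
path = os.path.join(tempfile.mkdtemp(), "mydata")

# Write the dataset to disk.
tf.data.experimental.save(dataset, path)

# Read it back. element_spec is passed explicitly because TF 2.3 and 2.4
# require it; it describes the structure, shapes and dtypes of one element.
restored = tf.data.experimental.load(path, element_spec=dataset.element_spec)

# Collect the scalar labels to check that every element came back.
restored_labels = sorted(int(label) for _, _, _, label in restored)
print(restored_labels)
```

The same two calls should work for the batched dataset as well, since element_spec is taken from the dataset being saved. The element_spec is not something you can easily guess later, so it may be worth keeping the code that builds the dataset (or the spec itself) alongside the saved directory. In newer TensorFlow releases these experimental functions are deprecated in favor of the save and load methods on tf.data.Dataset.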

answered Nov 17 '22 at 10:11 by Yoan B. M.Sc