There are at least two other questions like this on SO, but not a single one has been answered.
I have a dataset of the form:
<TensorSliceDataset shapes: ((512,), (512,), (512,), ()), types: (tf.int32, tf.int32, tf.int32, tf.int32)>
and another of the form:
<BatchDataset shapes: ((None, 512), (None, 512), (None, 512), (None,)), types: (tf.int32, tf.int32, tf.int32, tf.int32)>
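For reference, a dataset pair with this structure can be reproduced from stand-in data like the following (the array names and contents are hypothetical placeholders):
import numpy as np
import tensorflow as tf

# Hypothetical stand-in data: three (N, 512) int32 arrays plus one (N,)
# int32 array (e.g. token ids, attention masks, type ids, and labels).
N = 8
ids = np.zeros((N, 512), dtype=np.int32)
masks = np.ones((N, 512), dtype=np.int32)
types = np.zeros((N, 512), dtype=np.int32)
labels = np.zeros((N,), dtype=np.int32)

dataset = tf.data.Dataset.from_tensor_slices((ids, masks, types, labels))
# -> <TensorSliceDataset shapes: ((512,), (512,), (512,), ()), ...>
batched = dataset.batch(4)
# -> <BatchDataset shapes: ((None, 512), (None, 512), (None, 512), (None,)), ...>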
I have looked and looked but I can't find the code to save these datasets to files that can be loaded later. The closest I got was this page in the TensorFlow docs, which suggests serializing the tensors using tf.io.serialize_tensor and then writing them to a file using tf.data.experimental.TFRecordWriter.
However, when I tried this using the code:
dataset.map(tf.io.serialize_tensor)
writer = tf.data.experimental.TFRecordWriter('mydata.tfrecord')
writer.write(dataset)
I get an error on the first line:
TypeError: serialize_tensor() takes from 1 to 2 positional arguments but 4 were given
How can I modify the above (or do something else) to accomplish my goal?
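For context, the TypeError occurs because the dataset yields 4-tuples, so map() calls tf.io.serialize_tensor with four positional arguments instead of one. A minimal sketch of one possible workaround, assuming each component is serialized separately (serialize_example is a hypothetical helper, not part of the TensorFlow API):
import tensorflow as tf

def serialize_example(a, b, c, d):
    # Serialize each component to a scalar tf.string, stack the four
    # strings, then serialize the stack so every example becomes one
    # scalar string that TFRecordWriter can write.
    parts = [tf.io.serialize_tensor(t) for t in (a, b, c, d)]
    return tf.io.serialize_tensor(tf.stack(parts))

serialized = dataset.map(serialize_example)
writer = tf.data.experimental.TFRecordWriter('mydata.tfrecord')
writer.write(serialized)
Reading the file back would then need two rounds of tf.io.parse_tensor: once with out_type=tf.string to recover the stack, then once per component with its original dtype.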
One way would be to do np.save('file.npy', a.numpy()) and then convert back to a tensor after loading.
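A minimal sketch of that round trip, assuming the data fits in memory as a single tensor (a is a stand-in):
import numpy as np
import tensorflow as tf

a = tf.constant([[1, 2], [3, 4]], dtype=tf.int32)  # stand-in tensor
np.save('file.npy', a.numpy())  # write the tensor's values to disk
restored = tf.convert_to_tensor(np.load('file.npy'))  # back to a tf.Tensor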
Normally when you use TensorFlow Datasets (tfds), the downloaded and prepared data is cached in a local directory (by default ~/tensorflow_datasets).
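For example (a sketch; the dataset name and data_dir path are placeholders):
import tensorflow_datasets as tfds

# The first call downloads and prepares the data; later calls reuse the cache.
ds = tfds.load('mnist', split='train', data_dir='/tmp/my_tfds_cache')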
An issue was opened on GitHub, and it appears there's a new feature available in TF 2.3 to write datasets to disk:
https://www.tensorflow.org/api_docs/python/tf/data/experimental/save
https://www.tensorflow.org/api_docs/python/tf/data/experimental/load
I haven't tested these features yet, but they seem to do what you want.
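A minimal sketch of how the pair is meant to be used, going by the docs above (untested; 'saved_data' is a hypothetical directory, and in TF 2.3 load() requires the dataset's element_spec):
import tensorflow as tf

tf.data.experimental.save(dataset, 'saved_data')

# load() in TF 2.3 needs the element_spec of the saved dataset; capture it
# before saving (or reconstruct it by hand when loading in another program).
spec = dataset.element_spec
restored = tf.data.experimental.load('saved_data', element_spec=spec)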