Notes:
This question extends upon a previous question of mine, in which I asked about the best way to store some dummy data as Example and SequenceExample, seeking to know which is better for data similar to the dummy data provided. In that question I give explicit formulations of both the Example and SequenceExample construction as well as, in the answers, a programmatic way to do so.
Because this is still a lot of code, I am providing a Colab (an interactive Jupyter notebook hosted by Google) where you can try the code out yourself. All the necessary code is there and it is generously commented.
I am trying to learn how to convert my data into TF Records as the claimed benefits are worthwhile for my data. However, the documentation leaves a lot to be desired and the tutorials / blogs (that I have seen) which try to go deeper, really only touch the surface or rehash the sparse docs that exist.
For the demo data considered in my previous question, as well as here, I have written a decent class that takes the dummy data and can encode it in 1 of 6 forms (the first and fourth are sketched just below):

1. Example, with sequence channels / classes separated in numeric type (int64 in this case) with meta data tacked on
2. Example, with sequence channels / classes serialized as byte strings (via numpy.ndarray.tostring()) with meta data tacked on
3. Example, with sequence / classes dumped as a byte string with meta data tacked on
4. SequenceExample, with sequence channels / classes separate in a numeric type and meta data as context
(the remaining forms are the corresponding SequenceExample variants with the channels / classes stored as byte strings)
This works fine.
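To make the first and fourth forms concrete, here is a minimal sketch, assuming dummy data of shape n_steps x 3 and the meta data fields (Name, Val_1, Val_2) used in the answer below; all values are made up:

import numpy as np
import tensorflow as tf

sequence = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)    # n_steps x 3 channels (made up)
pclasses = np.array([[0.1, 0.2, 0.7], [0.3, 0.3, 0.4]])        # soft class probabilities (made up)
name, val_1, val_2 = 'A', 1.0, 2.0                             # meta data (made up)

# Form 1: an Example with the sequence / classes flattened into flat lists plus meta data
example = tf.train.Example(features=tf.train.Features(feature={
    'sequence': tf.train.Feature(int64_list=tf.train.Int64List(value=sequence.ravel().tolist())),
    'pclasses': tf.train.Feature(float_list=tf.train.FloatList(value=pclasses.ravel().tolist())),
    'Name'    : tf.train.Feature(bytes_list=tf.train.BytesList(value=[name.encode()])),
    'Val_1'   : tf.train.Feature(float_list=tf.train.FloatList(value=[val_1])),
    'Val_2'   : tf.train.Feature(float_list=tf.train.FloatList(value=[val_2])),
}))

# Form 4: a SequenceExample with one Feature per time step and the meta data as context
sequence_example = tf.train.SequenceExample(
    context=tf.train.Features(feature={
        'Name' : tf.train.Feature(bytes_list=tf.train.BytesList(value=[name.encode()])),
        'Val_1': tf.train.Feature(float_list=tf.train.FloatList(value=[val_1])),
        'Val_2': tf.train.Feature(float_list=tf.train.FloatList(value=[val_2])),
    }),
    feature_lists=tf.train.FeatureLists(feature_list={
        'sequence': tf.train.FeatureList(feature=[
            tf.train.Feature(int64_list=tf.train.Int64List(value=step.tolist()))
            for step in sequence]),
        'pclasses': tf.train.FeatureList(feature=[
            tf.train.Feature(float_list=tf.train.FloatList(value=step.tolist()))
            for step in pclasses]),
    }))

Either proto can then be serialized with .SerializeToString() and handed to a TFRecord writer.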
In the Colab I show how to write dummy data all in the same file as well as in separate files.
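For reference, the single-file vs. separate-files split is just a matter of where the serialized protos go; a rough sketch with placeholder records and made-up file names (dummy_all.tfrecords / dummy_part_{i}.tfrecords):

import tensorflow as tf

def tiny_record(i):
    # Placeholder: any serialized Example / SequenceExample works here
    return tf.train.Example(features=tf.train.Features(feature={
        'value': tf.train.Feature(int64_list=tf.train.Int64List(value=[i])),
    })).SerializeToString()

records = [tiny_record(i) for i in range(3)]

# All records in one file
# (tf.io.TFRecordWriter; tf.python_io.TFRecordWriter in older TF 1.x)
with tf.io.TFRecordWriter('dummy_all.tfrecords') as writer:
    for r in records:
        writer.write(r)

# One file per record
for i, r in enumerate(records):
    with tf.io.TFRecordWriter(f'dummy_part_{i}.tfrecords') as writer:
        writer.write(r)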
My question is: how can I recover this data?
I have made 4 attempts at doing so in the linked file.
Why is TFRecordReader under a different sub-package from TFRecordWriter?
In this post, I'm going to discuss TensorFlow Records. TensorFlow recommends storing and reading data in the TFRecord format. It internally uses Protocol Buffers to serialize/deserialize the data and store it as bytes, which takes less space to hold a large amount of data and to transfer it as well.
A TFRecord file is a sequence of such records serialized to binary; the binary format takes less storage space than other data formats. That's what I'm going to do now: I will convert all the records of a dataset to TFRecords, which can be serialized into binary and written to a file. TensorFlow says that:
Any byte-string that can be decoded in TensorFlow could be stored in a TFRecord file. Examples include: lines of text, JSON (using tf.io.decode_json_example), encoded image data, or serialized tf.Tensors (using tf.io.serialize_tensor/tf.io.parse_tensor).
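As a small illustration of the "serialized tf.Tensors" case (assuming TF 2.x eager execution, with made-up values):

import tensorflow as tf

t = tf.constant([[1, 2, 3], [4, 5, 6]], dtype=tf.int64)

# Serialize the tensor to a byte-string that can live in a BytesList feature
serialized = tf.io.serialize_tensor(t)                      # scalar tf.string tensor
feature = tf.train.Feature(
    bytes_list=tf.train.BytesList(value=[serialized.numpy()]))

# After reading the byte-string back out of a record, recover the tensor
restored = tf.io.parse_tensor(serialized, out_type=tf.int64)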
Let's start writing to a TFRecord file. The process is as simple as follows: read MNIST data and pre-process it, then write the MNIST data to a TFRecord file. NOTE: You may object, asking why we have to write MNIST data to a TFRecord file when MNIST is a small and ready-to-use dataset. The answer is simple: using MNIST here is just for educational purposes.
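A minimal sketch of that write step, assuming tf.keras.datasets for loading MNIST and a made-up file name (mnist_train.tfrecords); encoding each image as raw bytes plus an int64 label is just one reasonable option:

import tensorflow as tf

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

with tf.io.TFRecordWriter('mnist_train.tfrecords') as writer:
    for image, label in zip(x_train, y_train):
        example = tf.train.Example(features=tf.train.Features(feature={
            'image': _bytes_feature(image.tobytes()),   # 28x28 uint8 image as raw bytes
            'label': _int64_feature(int(label)),
        }))
        writer.write(example.SerializeToString())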
Solved by updating the features to include shape information and remembering that a SequenceExample's sequence features are unnamed FeatureLists.
import os
import tensorflow as tf  # TF 1.x style API, as used in the question

# Context (per-example meta data) parse spec
context_features = {
    'Name' : tf.FixedLenFeature([], dtype=tf.string),
    'Val_1': tf.FixedLenFeature([], dtype=tf.float32),
    'Val_2': tf.FixedLenFeature([], dtype=tf.float32)
}

# Sequence parse spec: each step carries 3 values, hence the (3,) shape
sequence_features = {
    'sequence': tf.FixedLenSequenceFeature((3,), dtype=tf.int64),
    'pclasses' : tf.FixedLenSequenceFeature((3,), dtype=tf.float32),
}

def parse(record):
    # Returns a (context, sequence) pair of dicts of tensors
    parsed = tf.parse_single_sequence_example(
        record,
        context_features=context_features,
        sequence_features=sequence_features
    )
    return parsed

filenames = [os.path.join(os.getcwd(), f"dummy_sequences_{i}.tfrecords") for i in range(3)]
dataset = tf.data.TFRecordDataset(filenames).map(parse)

iterator = tf.data.Iterator.from_structure(dataset.output_types,
                                           dataset.output_shapes)
next_element = iterator.get_next()
training_init_op = iterator.make_initializer(dataset)

with tf.Session() as sess:
    for _ in range(2):
        # Initialize an iterator over the training dataset.
        sess.run(training_init_op)
        for _ in range(3):
            ne = sess.run(next_element)
            print(ne)