
How to inspect a TensorFlow .tfrecord file?

I have a .tfrecord but I don't know how it is structured. How can I inspect the schema to understand what the .tfrecord file contains?

All the Stack Overflow answers and documentation I have found seem to assume I already know the structure of the file.

    import tensorflow as tf

    reader = tf.TFRecordReader()
    # string_input_producer expects a list (1-D tensor) of file names
    file = tf.train.string_input_producer(["record.tfrecord"])
    _, serialized_record = reader.read(file)

    # ...HOW TO INSPECT serialized_record...
Bob van Luijt, asked Feb 22 '17 at 14:02



2 Answers

Found it!

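    import tensorflow as tf

    # TF 1.x API; in TF 2.x the same iterator is tf.compat.v1.io.tf_record_iterator
    for example in tf.python_io.tf_record_iterator("data/foobar.tfrecord"):
        # Each item is a serialized tf.train.Example; parse it and print its features
        print(tf.train.Example.FromString(example))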

You can also convert each parsed record to JSON:

    from google.protobuf.json_format import MessageToJson
    ...
    jsonMessage = MessageToJson(tf.train.Example.FromString(example))
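If TensorFlow is not installed, or you only want to check that the file really is a sequence of records and count them, the framing can be walked with the standard library. This is a rough sketch, assuming the documented uncompressed TFRecord layout (8-byte little-endian length, 4-byte CRC of the length, the payload, 4-byte CRC of the payload). The function name `iter_raw_records` is made up for illustration, and the CRCs are skipped rather than verified.

```python
import struct


def iter_raw_records(path):
    """Yield the raw payload bytes of each record in an uncompressed TFRecord file."""
    with open(path, "rb") as f:
        while True:
            header = f.read(12)  # 8-byte length + 4-byte CRC of the length
            if not header:
                return  # clean end of file
            if len(header) < 12:
                raise ValueError("truncated record header")
            (length,) = struct.unpack("<Q", header[:8])
            payload = f.read(length)
            footer = f.read(4)  # 4-byte CRC of the payload (not verified here)
            if len(payload) < length or len(footer) < 4:
                raise ValueError("truncated record")
            yield payload
```

Each yielded payload is the same byte string the iterator above hands to `tf.train.Example.FromString`, so `sum(1 for _ in iter_raw_records("data/foobar.tfrecord"))` should give the record count. GZIP- or ZLIB-compressed files would need to be decompressed first.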
Bob van Luijt, answered Sep 30 '22 at 18:09


The solutions above didn't work for me, so for TF 2.0 use this:

    import tensorflow as tf

    raw_dataset = tf.data.TFRecordDataset("path-to-file")

    for raw_record in raw_dataset.take(1):
        example = tf.train.Example()
        example.ParseFromString(raw_record.numpy())
        print(example)

https://www.tensorflow.org/tutorials/load_data/tfrecord#reading_a_tfrecord_file_2
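Printing a whole example can be noisy when it holds large byte strings. Since the question is really about the schema, a minimal sketch like the following lists only the feature names, their value types, and how many values each holds. It assumes every record is a serialized `tf.train.Example`; the helper name `summarize_schema` is hypothetical, not part of TensorFlow.

```python
import tensorflow as tf


def summarize_schema(path, num_records=1):
    """Return {feature_name: (kind, value_count)} for the first records in a TFRecord file.

    Assumes each record is a serialized tf.train.Example.
    """
    schema = {}
    for raw_record in tf.data.TFRecordDataset(path).take(num_records):
        example = tf.train.Example()
        example.ParseFromString(raw_record.numpy())
        for name, feature in example.features.feature.items():
            # A Feature is a oneof: bytes_list, float_list, or int64_list
            kind = feature.WhichOneof("kind")
            count = len(getattr(feature, kind).value) if kind else 0
            schema[name] = (kind, count)
    return schema
```

Calling `summarize_schema("path-to-file")` returns a plain dict, which is usually enough to write the matching `tf.io.parse_single_example` feature description. Looking at more than one record (`num_records=10`, say) helps spot variable-length features.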

amalik2205, answered Sep 30 '22 at 19:09