
Obtaining the total number of records from a .tfrecords file in TensorFlow

Is it possible to obtain the total number of records from a .tfrecords file? Related to this, how does one generally keep track of the number of epochs that have elapsed while training models? While it is possible to specify the batch_size and num_of_epochs, I am not sure whether it is straightforward to obtain values such as the current epoch, the number of batches per epoch, etc., so that I could have more control over how the training is progressing. Currently, I'm using a dirty hack to compute this, as I know beforehand how many records there are in my .tfrecords file and the size of my minibatches. I'd appreciate any help.

HuckleberryFinn asked Nov 07 '16 18:11




1 Answer

To count the number of records, you should be able to use tf.python_io.tf_record_iterator.

    import tensorflow as tf

    # tf_records_filenames: list of paths to the .tfrecords files
    c = 0
    for fn in tf_records_filenames:
        for record in tf.python_io.tf_record_iterator(fn):
            c += 1
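If importing TensorFlow just to count records is inconvenient, the count can also be obtained by walking the file's record framing directly. This is a different technique from the iterator above, offered as a minimal sketch: it assumes an uncompressed TFRecord file, in which each record is stored as an 8-byte little-endian length, a 4-byte CRC of that length, the payload, and a 4-byte CRC of the payload. The function name `count_tfrecords` is hypothetical, and the sketch does not verify checksums.

```python
import struct


def count_tfrecords(path):
    """Count records in an uncompressed TFRecord file without TensorFlow.

    Assumption: standard framing of
    [uint64 length][uint32 length CRC][payload][uint32 payload CRC].
    CRCs are skipped, not checked, so corruption is not detected, and a
    final record whose payload is cut short is still counted.
    """
    count = 0
    with open(path, "rb") as f:
        while True:
            header = f.read(12)  # 8-byte length + 4-byte CRC of the length
            if not header:
                break  # clean end of file
            if len(header) < 12:
                raise ValueError("truncated record header in %s" % path)
            (length,) = struct.unpack("<Q", header[:8])
            f.seek(length + 4, 1)  # skip the payload and its 4-byte CRC
            count += 1
    return count
```

Because only the 12-byte headers are read and the payloads are skipped with a seek, this should be considerably cheaper than deserializing every record, at the cost of doing no integrity checking.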

To simply keep track of the model training, TensorBoard comes in handy.
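Once the record count is known, the current epoch and the number of batches per epoch that the question asks about follow from simple arithmetic on the training step counter. The sketch below is illustrative only: `training_progress` and its parameter names are hypothetical, and it assumes that steps are counted from zero, that every epoch sees all records once, and that the final partial batch of an epoch is kept rather than dropped.

```python
import math


def training_progress(global_step, num_records, batch_size):
    """Derive epoch bookkeeping from a zero-based global step counter.

    Returns (epoch, batch_in_epoch, batches_per_epoch), where epoch and
    batch_in_epoch are zero-based. Assumes the last, possibly smaller,
    batch of each epoch is kept.
    """
    batches_per_epoch = math.ceil(num_records / batch_size)
    epoch, batch_in_epoch = divmod(global_step, batches_per_epoch)
    return epoch, batch_in_epoch, batches_per_epoch


# Example: 10 records with a batch size of 4 gives 3 batches per epoch,
# so step 7 is the second batch (index 1) of the third epoch (index 2).
epoch, batch_in_epoch, per_epoch = training_progress(7, 10, 4)
```

If the input pipeline drops the incomplete final batch instead, the same idea applies with floor division (`num_records // batch_size`) in place of `math.ceil`.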

drpng answered Sep 28 '22 18:09
