How to append data to TensorFlow tfrecords file

How to append new data (e.g. pairs of images and labels) to an already existing tfrecord file?

The class tf.python_io.TFRecordWriter does not seem to have any option for that.

This question could also be rephrased as: how do I concatenate tfrecord files?
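For the concatenation variant, here is a minimal sketch that assumes the files are uncompressed. In the documented TFRecord layout each record carries its own length and checksums and the file has no header or footer, so joining the raw bytes of several files should give a file that reads as one long sequence of records. The function name `concat_tfrecords` and the file names are hypothetical.

```python
import shutil


def concat_tfrecords(sources, destination):
    """Byte-concatenate uncompressed TFRecord files into `destination`.

    Assumption: none of the inputs is GZIP/ZLIB compressed, and
    `destination` is not one of `sources` (it is truncated on open).
    """
    with open(destination, "wb") as out:
        for path in sources:
            with open(path, "rb") as src:
                # Stream the bytes across without loading whole files in memory.
                shutil.copyfileobj(src, out)
```

Usage would look like `concat_tfrecords(["old.tfrecord", "new.tfrecord"], "merged.tfrecord")`. It is worth reading the merged file back with TensorFlow once before relying on it.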

gizzmole asked Feb 08 '17 22:02


People also ask

How do I read a TFRecord file in Python?

TFRecordReader() file = tf.train.string_input_producer("record.tfrecord") _, serialized_record = reader.

What is a TFRecord file?

The TFRecord format is a simple format for storing a sequence of binary records. Protocol buffers are a cross-platform, cross-language library for efficient serialization of structured data. Protocol messages are defined by .proto files; these are often the easiest way to understand a message type.

What is the ideal size of a TFRecord file?

For X GB of data read by up to N hosts, you should ideally shard the data into ~10*N files, as long as each file (~X/(10*N)) is 10+ MB (and ideally 100+ MB). If it is less than that, you might need to create fewer shards to trade off parallelism benefits against I/O prefetching benefits.
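The rule of thumb above can be turned into a small calculation. This is only a sketch: it reads X as the total data size and N as the number of hosts reading the data (an assumption about the snippet's notation), treats 1 MB as 1024**2 bytes, and uses a hypothetical helper name `suggest_num_shards`.

```python
def suggest_num_shards(total_bytes, num_hosts, min_shard_bytes=10 * 1024**2):
    """Aim for ~10 * N shards, but use fewer if shards would drop below the minimum size."""
    target = 10 * num_hosts
    # Largest shard count that still keeps every shard at or above the minimum size.
    max_count_at_min_size = max(1, total_bytes // min_shard_bytes)
    return min(target, max_count_at_min_size)
```

With 100 GB of data and 4 hosts this gives 40 shards. With only 200 MB and 4 hosts it falls back to 20 shards of about 10 MB each.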


1 Answer

According to the comments in the ticket I opened, this won't be implemented soon.
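In the meantime, one possible workaround is to bypass the writer class and append a record at the file level. The sketch below assumes an uncompressed file and the documented TFRecord framing: a little-endian uint64 length, a masked CRC32C of the length, the payload bytes, and a masked CRC32C of the payload. The helper names (`crc32c`, `masked_crc32c`, `append_record`) are hypothetical, and the payload would typically be the bytes returned by `SerializeToString()` on a `tf.train.Example`. It is not a TensorFlow API, so check the result by reading the file back with TensorFlow.

```python
import struct

_CRC32C_POLY = 0x82F63B78  # reflected Castagnoli polynomial


def _build_table():
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ _CRC32C_POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _build_table()


def crc32c(data):
    """Plain table-driven CRC32C over a bytes object."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def masked_crc32c(data):
    """CRC32C with the rotate-and-add mask used in the TFRecord framing."""
    crc = crc32c(data)
    rotated = ((crc >> 15) | (crc << 17)) & 0xFFFFFFFF
    return (rotated + 0xA282EAD8) & 0xFFFFFFFF


def append_record(path, payload):
    """Append one framed record to an uncompressed TFRecord file."""
    length = struct.pack("<Q", len(payload))
    with open(path, "ab") as f:  # "ab" keeps the existing records intact
        f.write(length)
        f.write(struct.pack("<I", masked_crc32c(length)))
        f.write(payload)
        f.write(struct.pack("<I", masked_crc32c(payload)))
```

Each call adds 16 bytes of framing plus the payload. If TensorFlow is available anyway, writing the new records to a separate file with the regular writer and then joining the two files is the less fragile route.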

gizzmole answered Sep 21 '22 16:09