I have created a dataset and saved it into a TFRecord file. The images have different sizes, so I want to store each image's size alongside it. I used TFRecordWriter and defined the features like:
example = tf.train.Example(features=tf.train.Features(feature={
'rows': _int64_feature(image.shape[0]),
'cols': _int64_feature(image.shape[1]),
'image_raw': _bytes_feature(image_raw)}))
I expected to be able to read and decode the image using TFRecordReader, but I cannot get the values of rows and cols from the file because they are tensors. How am I supposed to read the size dynamically and reshape the image accordingly? Thanks.
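(For reference, the `_int64_feature` and `_bytes_feature` helpers used above are typically defined as in the TensorFlow TFRecord tutorials; this is a sketch of what they presumably look like, your definitions may differ:)

```python
import numpy as np
import tensorflow as tf

def _int64_feature(value):
    # Wrap a Python int in a TFRecord-compatible int64 feature.
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def _bytes_feature(value):
    # Wrap raw bytes in a TFRecord-compatible bytes feature.
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

# Usage with a dummy image, mirroring the question's snippet.
image = np.zeros((4, 6, 3), dtype=np.uint8)
image_raw = image.tobytes()
example = tf.train.Example(features=tf.train.Features(feature={
    'rows': _int64_feature(image.shape[0]),
    'cols': _int64_feature(image.shape[1]),
    'image_raw': _bytes_feature(image_raw)}))
```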
You can call tf.reshape with a dynamic shape parameter.
image_rows = tf.cast(features['rows'], tf.int32)
image_cols = tf.cast(features['cols'], tf.int32)
image_data = tf.decode_raw(features['image_raw'], tf.uint8)
# Note: tf.pack was renamed to tf.stack in TensorFlow 1.0.
image = tf.reshape(image_data, tf.stack([image_rows, image_cols, 3]))
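In newer TensorFlow versions, the queue-based TFRecordReader was replaced by tf.data; the same dynamic-reshape idea looks like this (a sketch, assuming a file named `images.tfrecord` written with the rows/cols/image_raw features from the question):

```python
import tensorflow as tf

feature_spec = {
    'rows': tf.io.FixedLenFeature([], tf.int64),
    'cols': tf.io.FixedLenFeature([], tf.int64),
    'image_raw': tf.io.FixedLenFeature([], tf.string),
}

def parse(serialized):
    features = tf.io.parse_single_example(serialized, feature_spec)
    rows = tf.cast(features['rows'], tf.int32)
    cols = tf.cast(features['cols'], tf.int32)
    image = tf.io.decode_raw(features['image_raw'], tf.uint8)
    # Dynamic reshape: the shape tensor is built from the parsed sizes.
    return tf.reshape(image, tf.stack([rows, cols, 3]))

dataset = tf.data.TFRecordDataset('images.tfrecord').map(parse)
```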
I suggest a workflow like:
TARGET_HEIGHT = 500
TARGET_WIDTH = 500
image = tf.image.decode_jpeg(image_buffer, channels=3)
image = tf.image.convert_image_dtype(image, dtype=tf.float32)
# Choose your bbox start here.
bbox_begin = ...  # your own logic; should be (h_start, w_start, 0)
bbox_size = tf.constant((TARGET_HEIGHT, TARGET_WIDTH, 3), dtype=tf.int32)
cropped_image = tf.slice(image, bbox_begin, bbox_size)
cropped_image has a constant tensor size, and can then be fed into a shuffle batch.
You can dynamically access the size of the decoded image using tf.shape(image). You can do computations on the resulting sub-elements and then stitch them back together using something like bbox_begin = tf.pack([bbox_h_start, bbox_w_start, 0]) (tf.pack became tf.stack in TF 1.0). You just need to insert your own logic for determining the start points of the crop, and decide what to do if the image starts out smaller than your pipeline's target.
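As a concrete illustration of that stitching, here is one possible choice of logic, a centered crop start computed from tf.shape (a sketch using the modern tf.stack name; the 600x800 image is a made-up example):

```python
import tensorflow as tf

TARGET_HEIGHT = 500
TARGET_WIDTH = 500

# A stand-in for the decoded image; in the pipeline this comes from decode_jpeg.
image = tf.zeros([600, 800, 3])

# tf.shape gives the dynamic size as a tensor; sub-elements can be computed on.
shape = tf.shape(image)
bbox_h_start = (shape[0] - TARGET_HEIGHT) // 2
bbox_w_start = (shape[1] - TARGET_WIDTH) // 2

# Stitch the start points back together (tf.pack in pre-1.0 TensorFlow).
bbox_begin = tf.stack([bbox_h_start, bbox_w_start, 0])
```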
If you want to upsize only when the image is smaller than your target dimensions, you'll need tf.cond (tf.control_flow_ops.cond in older versions) or equivalent. But you can instead use min and max operations to set the size of your crop window so that you return the full image if and only if it's smaller than the requested dimensions, and then unconditionally resize to 500x500. If the image was cropped, it is already 500x500, so the resize becomes an effective no-op.
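The min/max trick above can be sketched like this (a sketch, not the answerer's exact code; tf.image.resize was tf.image.resize_images in old TensorFlow, and the function name is made up):

```python
import tensorflow as tf

TARGET_HEIGHT = 500
TARGET_WIDTH = 500

def crop_then_resize(image):
    """Clamp the crop window with min so small images pass through whole,
    then resize unconditionally, avoiding any tf.cond."""
    shape = tf.shape(image)
    crop_h = tf.minimum(shape[0], TARGET_HEIGHT)
    crop_w = tf.minimum(shape[1], TARGET_WIDTH)
    start_h = (shape[0] - crop_h) // 2  # centered crop; 0 for small images
    start_w = (shape[1] - crop_w) // 2
    cropped = tf.slice(image, tf.stack([start_h, start_w, 0]),
                       tf.stack([crop_h, crop_w, 3]))
    # For images that were cropped to 500x500 this resize is a shape no-op.
    return tf.image.resize(cropped, [TARGET_HEIGHT, TARGET_WIDTH])
```

Large inputs get center-cropped to 500x500 before the resize; smaller inputs skip the crop (the window equals the whole image) and only get upsized.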