I have been experimenting with tensorflow Datasets but I cannot figure out how to efficiently create RLE-masks. FYI, I am using data from the Airbus Ship Detection Challenge in Kaggle: https://www.kaggle.com/c/airbus-ship-detection/data
I know my RLE-decoding function works (borrowed) from one of the kernels:
def rle_decode(mask_rle, shape=(768, 768)):
'''
mask_rle: run-length as string formated (start length)
shape: (height,width) of array to return
Returns numpy array, 1 - mask, 0 - background
'''
if not isinstance(mask_rle, str):
img = np.zeros(shape[0]*shape[1], dtype=np.uint8)
return img.reshape(shape).T
s = mask_rle.split()
starts, lengths = [np.asarray(x, dtype=int) for x in (s[0:][::2], s[1:][::2])]
starts -= 1
ends = starts + lengths
img = np.zeros(shape[0]*shape[1], dtype=np.uint8)
for lo, hi in zip(starts, ends):
img[lo:hi] = 1
return img.reshape(shape).T
.... BUT it does not seem to play nicely with the pipeline:
list_ds = tf.data.Dataset.list_files(train_paths_abs)
ds = list_ds.map(parse_img)
With the following parse function, everything works fine:
def parse_img(file_path,new_size=[128,128]):
img_content = tf.io.read_file(file_path)
img = tf.image.decode_jpeg(img_content)
img = tf.image.convert_image_dtype(img, tf.float32)
img = tf.image.resize(img,new_size)
return img
But things go rogue if I include the mask:
def parse_img(file_path,new_size=[128,128]):
# Image
img_content = tf.io.read_file(file_path)
img = tf.image.decode_jpeg(img_content)
img = tf.image.convert_image_dtype(img, tf.float32)
img = tf.image.resize(img,new_size)
# Mask
file_id = tf.strings.split(file_path,'/')[-1]
objects = [rle_decode(m) for m in df2[df.ImageId==file_id]]
mask = np.sum(objects,axis=0)
mask = np.expand_dims(mask,3) # Force mask to have 3 channels, necessary for resize step
mask = tf.image.convert_image_dtype(mask, tf.int8)
mask = tf.clip_by_value(mask,0,1)
mask = tf.image.resize(mask,new_size)
mask = tf.squeeze(mask) # squeeze back
mask = tf.image.convert_image_dtype(mask, tf.int8)
return img, mask
Although my parse_img
function works fine (I have checked it on a sample, it takes 271 µs ± 67.9 µs per run); the list_ds.map
step takes forever (>5 minutes) before hanging.
I can't figure out what's wrong and it drives me crazy!
Any idea?
The RLE decompression consists in browsing the message formed of pairs (character, number of repetition) and writing the equivalent text by writing the character the corresponding number of times.
RLE is run-length encoding. It is used to encode the location of foreground objects in segmentation. Instead of outputting a mask image, you give a list of start pixels and how many pixels after each of those starts is included in the mask.
The best case is when 128 identical characters follow each other, this is compressed into 2 bytes instead of 128 giving a compression ratio of 64. For this reason RLE is most often used to compress black and white or 8 bit indexed colour images where long runs are likely.
August 12, 2021. Run Length Encoding is a lossless data compression algorithm. It compresses data by reducing repetitive, and consecutive data called runs. It does so by storing the number of these runs followed by the data.
You can rewrite the function rle_decode
with tensorflow like this (here I do not do the final transposition to keep it more general, but you can do it later):
import tensorflow as tf
def rle_decode_tf(mask_rle, shape):
shape = tf.convert_to_tensor(shape, tf.int64)
size = tf.math.reduce_prod(shape)
# Split string
s = tf.strings.split(mask_rle)
s = tf.strings.to_number(s, tf.int64)
# Get starts and lengths
starts = s[::2] - 1
lens = s[1::2]
# Make ones to be scattered
total_ones = tf.reduce_sum(lens)
ones = tf.ones([total_ones], tf.uint8)
# Make scattering indices
r = tf.range(total_ones)
lens_cum = tf.math.cumsum(lens)
s = tf.searchsorted(lens_cum, r, 'right')
idx = r + tf.gather(starts - tf.pad(lens_cum[:-1], [(1, 0)]), s)
# Scatter ones into flattened mask
mask_flat = tf.scatter_nd(tf.expand_dims(idx, 1), ones, [size])
# Reshape into mask
return tf.reshape(mask_flat, shape)
A small test (TensorFlow 2.0):
mask_rle = '1 2 4 3 9 4 15 5'
shape = [4, 6]
# Original NumPy function
print(rle_decode(mask_rle, shape))
# [[1 0 0 1]
# [1 0 0 0]
# [0 1 1 0]
# [1 1 1 0]
# [1 1 1 0]
# [1 1 1 0]]
# TensorFlow function (transposing is done out of the function)
tf.print(tf.transpose(rle_decode_tf(mask_rle, shape)))
# [[1 0 0 1]
# [1 0 0 0]
# [0 1 1 0]
# [1 1 1 0]
# [1 1 1 0]
# [1 1 1 0]]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With