I do not quite understand this guide: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/instance_segmentation.md
I have many objects of three classes. According to the guide I have to make a mask with dimensions [N, H, W], where N is the number of objects in the image and H, W are the image height and width.
I have this function to create a mask:

```python
import numpy as np
import cv2

def image_mask(img, polygons):
    w, h = img.size
    n = len(polygons)
    mask = np.zeros([n, h, w], dtype=np.float32)
    for i in range(n):
        # fillPoly expects integer vertex coordinates of shape (-1, 1, 2)
        polygon = polygons[i].reshape((-1, 1, 2)).astype(np.int32)
        tmp_mask = np.zeros([h, w], dtype=np.float32)
        cv2.fillPoly(tmp_mask, [polygon], 1)
        mask[i, :, :] = tmp_mask
    return mask
```
I use this guide for creating my dataset: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/using_your_own_dataset.md
I add the mask at the end of tf_example:

```python
tf_example = tf.train.Example(features=tf.train.Features(feature={
    ...
    'image/object/class/label': dataset_util.int64_list_feature(classes),
    'image/object/mask': dataset_util.bytes_list_feature(mask.reshape((-1))),
}))
```
Because of the reshape (I suppose), RAM quickly runs out and I get a memory error. What am I doing wrong? Is there a detailed guide somewhere on how to create masks for Mask R-CNN with the TensorFlow Object Detection API? I did not find one.
This is an old question, but it looks like you aren't converting your mask data to bytes before sending it to a bytes_list_feature.
If there are still memory issues, the 'image/object/mask' feature can be a list of bytes strings, one per object. If you have a very large n, the other option (an N x H x W array that must be manipulated after compilation) may cause memory issues.
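The per-object split described above reduces to one `np.where` call per instance ID. A minimal, numpy-only sketch (the instance map and IDs here are made up for illustration):

```python
import numpy as np

# a tiny hypothetical 3x3 instance map: 0 = background,
# 1001 and 2001 = two object instances
inst_map = np.array([[0,    1001, 1001],
                     [0,    1001, 2001],
                     [2001, 2001, 2001]])

masks = {}
for inst_id in np.unique(inst_map):
    if inst_id == 0:  # skip the background label
        continue
    # binary HxW mask: 1 where this instance is, 0 elsewhere
    masks[inst_id] = np.where(inst_map == inst_id, 1, 0).astype(np.uint8)

print(masks[1001].sum())  # 3 pixels belong to object 1001
```

Each of these binary masks is then PNG-encoded and appended to the bytes list, as in the full snippet below.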
Here's how to compile instance map data into object masks using the bytes-list option:
```python
import io

import cv2
import numpy as np
import tensorflow as tf
from object_detection.utils import dataset_util

# an HxW array of integer instance IDs, one per unique object
inst_map_data = func_to_get_inst_map_data(inst_map_path)

object_masks = []
class_labels = []
for inst_id in np.unique(inst_map_data):  # loop through all objects
    # an HxW array of 0's and 1's, the 1's marking where this object is;
    # cast to uint8 so cv2.imencode accepts it
    obj_mask_data = np.where(inst_map_data == inst_id, 1, 0).astype(np.uint8)
    # encode as PNG for space saving
    is_success, buffer = cv2.imencode(".png", obj_mask_data)
    io_buf = io.BytesIO(buffer)
    # a bytes string
    obj_mask_data_bytes = io_buf.getvalue()
    object_masks.append(obj_mask_data_bytes)
    # assumes instance IDs encode the class as class_id * 1000 + instance
    class_labels.append(int(inst_id) // 1000)

tf_example = tf.train.Example(features=tf.train.Features(feature={
    ...
    'image/object/class/label': dataset_util.int64_list_feature(class_labels),
    'image/object/mask': dataset_util.bytes_list_feature(object_masks),
}))
```