
How do I create my own dataset for Mask-RCNN models from the Tensorflow Object Detection API?

I do not quite understand this guide: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/instance_segmentation.md

I have many objects of three classes. According to the guide, I have to create a mask of shape [N, H, W], where:

  • N - number of objects
  • H - image height
  • W - image width

I have this function to create the mask:

import cv2
import numpy as np

def image_mask(img, polygons):
    w, h = img.size
    n = len(polygons)
    # one HxW mask per polygon, stacked into an NxHxW array
    mask = np.zeros([n, h, w], dtype=np.float32)
    for i in range(n):
        # cv2.fillPoly expects int32 points
        polygon = polygons[i].reshape((-1, 1, 2)).astype(np.int32)
        tmp_mask = np.zeros([h, w], dtype=np.float32)
        cv2.fillPoly(tmp_mask, [polygon], (1, 1, 1))
        mask[i, :, :] = tmp_mask
    return mask

I use this guide for creating my dataset: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/using_your_own_dataset.md

I add the mask at the end of tf_example:

tf_example = tf.train.Example(features=tf.train.Features(feature={
...
      'image/object/class/label': dataset_util.int64_list_feature(classes),
      'image/object/mask': dataset_util.bytes_list_feature(mask.reshape((-1))),
  }))

Because of the reshape (I suppose), RAM quickly runs out and I get a memory error. What am I doing wrong? Is there a detailed guide somewhere on how to create masks for Mask-RCNN with the Tensorflow Object Detection API? I could not find one.

Semyon asked Nov 08 '22 02:11



1 Answer

This is an old question, but it looks like you aren't converting your mask data to bytes before passing it to bytes_list_feature.
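As a small illustration of that point, the sketch below looks at what mask.reshape((-1)) actually produces. The array dimensions are made-up values for the example, not ones from the question.

```python
import numpy as np

# Hypothetical sizes for illustration: 3 objects in a 4x5 image.
n, h, w = 3, 4, 5
mask = np.zeros([n, h, w], dtype=np.float32)

# Flattening gives one long float32 array with N*H*W elements.
flat = mask.reshape((-1))
print(flat.shape, flat.dtype)

# None of those elements is a bytes string, which is what a bytes list holds.
print(isinstance(flat[0], bytes))
```

So the flattened array is still numeric data, with one entry per pixel per object, rather than a list of bytes strings.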

If there are still memory issues, the 'image/object/mask' feature can be a list of bytes strings, one for each object. If n is very large, the other option (a single NxHxW array that must be built and then reshaped) may cause memory issues.
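For a rough sense of scale, here is a back-of-the-envelope estimate of how large a dense float32 mask array gets. The helper name, object count, and image size are hypothetical and chosen only for illustration.

```python
def dense_mask_bytes(n_objects, height, width, bytes_per_value=4):
    """Size in bytes of an [N, H, W] array; float32 uses 4 bytes per value."""
    return n_objects * height * width * bytes_per_value

# Hypothetical example: 50 objects in a 1024x1024 image.
size = dense_mask_bytes(50, 1024, 1024)
print(f"{size / 2**20:.0f} MiB per image")  # 200 MiB per image
```

The size grows linearly with the number of objects, which is why a large n hurts even when each image is modest.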

Here's how to convert instance-map data into object masks using the bytes-list option:

import cv2
import numpy as np

# an HxW array of integer instance IDs, one per unique object
inst_map_data = func_to_get_inst_map_data(inst_map_path)

object_masks = []
class_labels = []
for inst_id in np.unique(inst_map_data):  # loop through all objects

    # an HxW array of 0s and 1s; 1 marks where this object is
    # (cast to uint8 because PNG encoding expects 8-bit data)
    obj_mask_data = np.where(inst_map_data == inst_id, 1, 0).astype(np.uint8)

    # encode as PNG to save space
    is_success, buffer = cv2.imencode(".png", obj_mask_data)
    if not is_success:
        raise ValueError(f"PNG encoding failed for instance {inst_id}")

    # a bytes string
    obj_mask_data_bytes = buffer.tobytes()

    object_masks.append(obj_mask_data_bytes)
    # the class ID is the instance ID with its last three digits dropped
    class_labels.append(int(inst_id // 1000))

tf_example = tf.train.Example(features=tf.train.Features(feature={
...
      'image/object/class/label': dataset_util.int64_list_feature(class_labels),
      'image/object/mask': dataset_util.bytes_list_feature(object_masks),
  }))
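To make the instance-map step concrete, here is a toy sketch using only NumPy. The 4x4 map is invented for illustration, the ID scheme (class_id * 1000 + instance index) is inferred from the label arithmetic above, and treating 0 as background is an assumption. Note that the loop above does not skip any ID.

```python
import numpy as np

# Hypothetical instance map: IDs assumed to be class_id * 1000 + instance index,
# with 0 assumed to mean background.
inst_map = np.array([
    [0,    0,    1001, 1001],
    [0,    0,    1001, 1001],
    [2001, 2001, 0,    1002],
    [2001, 2001, 0,    1002],
])

masks, labels = [], []
for inst_id in np.unique(inst_map):
    if inst_id == 0:  # assumption: skip background pixels
        continue
    # HxW uint8 mask: 1 where this instance is, 0 elsewhere
    masks.append((inst_map == inst_id).astype(np.uint8))
    labels.append(int(inst_id // 1000))

print(labels)               # [1, 1, 2]
print(int(masks[0].sum()))  # 4
```

Each entry in masks is what would then be PNG-encoded into one bytes string, so the feature ends up with one item per object.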

Jessica answered Nov 14 '22 23:11
