
Converting the annotations to COCO format from Mask-RCNN dataset format

I want to train a model that detects vehicles and roads in an image. I will use Mask R-CNN and YOLACT++ for that purpose. I labelled some of my images for Mask R-CNN with the VGG Image Annotator, and the segmentation points look like the image below.

[Image: VGG Image Annotator JSON output showing only segmentation points (all_points_x / all_points_y), with no area or bbox fields]

As you can see, there is no area or bbox parameter. I can derive the bbox of each instance from min x, min y, max x, max y, but I couldn't figure out how to compute the area of the segmented region. You can see the YOLACT annotation format in the image below.

[Image: YOLACT (COCO-style) annotation format, including segmentation, area, and bbox fields]
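Since the screenshot isn't reproduced here, a COCO-style annotation entry (the format YOLACT consumes) looks roughly like the following; the values are illustrative:

# Illustrative COCO-style annotation entry (values are made up):
annotation = {
    "segmentation": [[510.66, 423.01, 511.72, 420.03, 510.45, 416.0]],  # flattened [x1, y1, x2, y2, ...] polygon
    "area": 702.1,                                                      # region area in pixels
    "bbox": [473.07, 395.93, 38.65, 28.67],                             # [x, y, width, height]
    "image_id": 289343,
    "category_id": 18,
    "id": 1768,
    "iscrowd": 0,
}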

Labelling every instance takes a lot of time: I spent a minimum of 10 minutes labelling all the cars in a single image, and I already have 500 labelled images. Do you have any advice or ideas that could save me time converting the first annotation format to the second (Mask R-CNN to COCO/YOLACT)?

asked Apr 14 '20 by Burla Nur Korkmaz




3 Answers

Something like this should work, but it depends on how you annotate in VGG:

import json
import math
from itertools import chain

def vgg_to_coco(vgg_path: str, outfile: str = None, class_keyword: str = "Class"):
    with open(vgg_path) as f:
        vgg = json.load(f)

    images_ids_dict = {v["filename"]: i for i, v in enumerate(vgg.values())}
    # TODO: the image size is hard-coded (see the note below the function)
    images_info = [{"file_name": k, "id": v, "width": 1024, "height": 1024} for k, v in images_ids_dict.items()]

    classes = {class_keyword} | {r["region_attributes"][class_keyword] for v in vgg.values() for r in v["regions"]
                                 if class_keyword in r["region_attributes"]}
    category_ids_dict = {c: i for i, c in enumerate(classes, 1)}
    categories = [{"supercategory": class_keyword, "id": v, "name": k} for k, v in category_ids_dict.items()]
    annotations = []
    suffix_zeros = math.ceil(math.log10(len(vgg)))
    for i, v in enumerate(vgg.values()):
        for j, r in enumerate(v["regions"]):
            if class_keyword in r["region_attributes"]:
                x, y = r["shape_attributes"]["all_points_x"], r["shape_attributes"]["all_points_y"]
                annotations.append({
                    "segmentation": [list(chain.from_iterable(zip(x, y)))],
                    "area": helper.polygon_area(x, y),
                    "bbox": helper.bbox(x, y, out_format="width_height"),
                    "image_id": images_ids_dict[v["filename"]],
                    "category_id": category_ids_dict[r["region_attributes"][class_keyword]],
                    "id": int(f"{i:0>{suffix_zeros}}{j:0>{suffix_zeros}}"),
                    "iscrowd": 0
                })

    coco = {
        "images": images_info,
        "categories": categories,
        "annotations": annotations
    }
    if outfile is None:
        outfile = vgg_path.replace(".json", "_coco.json")
    with open(outfile, "w") as f:
        json.dump(coco, f)

You will have to change the 1024s to your image size, or, if your image sizes vary, build a map from filename to image size.
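For example, a minimal sketch of such a filename-to-size map using Pillow (the image_size_map helper and image_dir argument are my own illustration, not part of the answer):

import os
from PIL import Image

def image_size_map(image_dir: str, filenames):
    """Map each filename to its (width, height), read from the image file on disk."""
    sizes = {}
    for name in filenames:
        with Image.open(os.path.join(image_dir, name)) as im:
            sizes[name] = im.size  # Pillow reports (width, height)
    return sizes

You could then look up sizes[v["filename"]] when building images_info, which is essentially what the answer below does with skimage.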

answered Oct 11 '22 by Zac Todd


You must create your own script to transform it; I had to do the same going from XML annotations to Mask R-CNN JSON.

You can check the example: https://github.com/adions025/XMLtoJson_Mask_RCNN
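For a rough idea of what such a transform involves, here is my own minimal sketch assuming Pascal VOC-style XML (it is not the exact code from the linked repo):

import xml.etree.ElementTree as ET

def voc_xml_to_dicts(xml_path: str):
    """Parse one Pascal VOC-style XML file into simple annotation dicts."""
    root = ET.parse(xml_path).getroot()
    objects = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        xmin, ymin = float(box.find("xmin").text), float(box.find("ymin").text)
        xmax, ymax = float(box.find("xmax").text), float(box.find("ymax").text)
        objects.append({
            "category": obj.find("name").text,
            "bbox": [xmin, ymin, xmax - xmin, ymax - ymin],  # COCO-style [x, y, w, h]
        })
    return objects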

answered Oct 11 '22 by Adonis González


Working solution, extended from @Zac Todd's answer.

The image size is computed on the fly.

import json
import math
import os
from itertools import chain

import numpy as np
import skimage.io

def vgg_to_coco(dataset_dir, vgg_path: str, outfile: str=None, class_keyword: str = "label"):
    with open(vgg_path) as f:
        vgg = json.load(f)

    images_ids_dict = {}
    images_info = []
    for i,v in enumerate(vgg.values()):

        images_ids_dict[v["filename"]] = i
        image_path = os.path.join(dataset_dir, v['filename'])
        image = skimage.io.imread(image_path)
        height, width = image.shape[:2]  
        images_info.append({"file_name": v["filename"], "id": i, "width": width, "height": height})

    classes = {class_keyword} | {r["region_attributes"][class_keyword] for v in vgg.values() for r in v["regions"].values()
                             if class_keyword in r["region_attributes"]}
    category_ids_dict = {c: i for i, c in enumerate(classes, 1)}
    categories = [{"supercategory": class_keyword, "id": v, "name": k} for k, v in category_ids_dict.items()]
    annotations = []
    suffix_zeros = math.ceil(math.log10(len(vgg)))
    for i, v in enumerate(vgg.values()):
        for j, r in enumerate(v["regions"].values()):
            if class_keyword in r["region_attributes"]:
                x, y = r["shape_attributes"]["all_points_x"], r["shape_attributes"]["all_points_y"]
                annotations.append({
                    "segmentation": [list(chain.from_iterable(zip(x, y)))],
                    "area": PolyArea(x, y),
                    "bbox": [min(x), min(y), max(x)-min(x), max(y)-min(y)],
                    "image_id": images_ids_dict[v["filename"]],
                    "category_id": category_ids_dict[r["region_attributes"][class_keyword]],
                    "id": int(f"{i:0>{suffix_zeros}}{j:0>{suffix_zeros}}"),
                    "iscrowd": 0
                    })

    coco = {
        "images": images_info,
        "categories": categories,
        "annotations": annotations
        }
    if outfile is None:
        outfile = vgg_path.replace(".json", "_coco.json")
    with open(outfile, "w") as f:
        json.dump(coco, f)

My data was labeled using makesense.ai, and region_attributes looks like this, so class_keyword="label" in the function call:

"region_attributes": {
      "label": "box"
    }
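A hypothetical call (the paths are illustrative) would then be:

vgg_to_coco("dataset/images", "dataset/vgg_annotations.json", class_keyword="label")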

To compute the polygon area, the code is copied from this answer:

def PolyArea(x, y):
    # Shoelace formula: area of a simple polygon from its vertex coordinates
    return 0.5 * np.abs(np.dot(x, np.roll(y, 1)) - np.dot(y, np.roll(x, 1)))
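
As a quick sanity check of the formula (my own example, not part of the original answer):

# A 2x3 axis-aligned rectangle should have area 6.
x = [0, 2, 2, 0]
y = [0, 0, 3, 3]
assert PolyArea(x, y) == 6.0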
answered Oct 11 '22 by Muhammad Haseeb Khan