I want to train a model that detects vehicles and roads in an image, using Mask R-CNN and YOLACT++. I labelled some of my images for Mask R-CNN with the VGG Image Annotator, and the segmentation points look like the image below.
As you can see, there is no area or bbox parameter. I can derive the bbox of each instance from min x, min y, max x, max y, but I couldn't work out how to compute the area of the segmented region. You can see the YOLACT annotation format in the image below.
Labelling every instance takes a huge amount of time: I spent at least 10 minutes labelling all the cars in a single image, and I already have 500 labelled images. Do you have any advice or ideas that could save me time converting the first annotation format to the second one (Mask R-CNN/VIA to COCO/YOLACT)?
COCO Dataset Formats: COCO stores data in a JSON file with top-level info, licenses, categories, images, and annotations sections. You can create a separate JSON file for training, validation, and testing. The info section provides a high-level description of the dataset.
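For reference, a minimal skeleton of that layout, written as a Python dict, looks like this (all values here are illustrative, not from a real dataset):

    # Illustrative COCO skeleton; every value is made up for demonstration.
    coco_skeleton = {
        "info": {"description": "Vehicles and roads", "version": "1.0", "year": 2020},
        "licenses": [{"id": 1, "name": "CC BY 4.0", "url": ""}],
        "categories": [{"id": 1, "name": "car", "supercategory": "vehicle"}],
        "images": [{"id": 0, "file_name": "0001.jpg", "width": 1024, "height": 768}],
        "annotations": [{
            "id": 0,
            "image_id": 0,
            "category_id": 1,
            "segmentation": [[10, 10, 60, 10, 60, 40, 10, 40]],  # flat [x1, y1, x2, y2, ...] polygon
            "area": 1500.0,
            "bbox": [10, 10, 50, 30],  # [x, y, width, height]
            "iscrowd": 0
        }]
    }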
To download the COCO dataset, visit the download page on the COCO website. There is also a Python script to pull the object-detection portion of the COCO dataset to your local drive.
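That script isn't reproduced here, but a minimal sketch, assuming the standard cocodataset.org archive URLs (verify them on the download page before use), might look like:

    import urllib.request
    import zipfile

    # Assumed archive URLs from the COCO download page; check they are current.
    ANNOTATIONS_URL = "http://images.cocodataset.org/annotations/annotations_trainval2017.zip"
    VAL_IMAGES_URL = "http://images.cocodataset.org/zips/val2017.zip"

    for url in (ANNOTATIONS_URL, VAL_IMAGES_URL):
        filename = url.rsplit("/", 1)[-1]
        urllib.request.urlretrieve(url, filename)  # download the zip archive
        with zipfile.ZipFile(filename) as zf:
            zf.extractall("coco")                  # unpack into ./coco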
You can write Python code to convert it to JSON format and append it to a COCO dataset; the COCO-Stuff conversion code in Python is a useful reference.
However, in instance segmentation the masks are different. In issue #56 it is mentioned that Mask R-CNN generates 28x28 float masks; as I understand it, each class gets its own mask.
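To place one of those low-resolution float masks back onto the full image, the usual step is to resize it to the detection box and threshold it. A rough sketch of that "unmold" step using skimage (this is my own sketch, not the Matterport code itself):

    import numpy as np
    from skimage.transform import resize

    def unmold_mask(small_mask, bbox, image_shape, threshold=0.5):
        """Resize a 28x28 float mask into its bounding box and binarize it.

        small_mask: (28, 28) float array from the mask head.
        bbox: (y1, x1, y2, x2) pixel coordinates of the detection.
        image_shape: (height, width) of the original image.
        """
        y1, x1, y2, x2 = bbox
        # Upsample the low-resolution mask to the box size.
        mask = resize(small_mask, (y2 - y1, x2 - x1))
        # Threshold the soft mask into a binary one.
        mask = mask >= threshold
        # Paste the box-sized mask into a full-size canvas.
        full_mask = np.zeros(image_shape, dtype=bool)
        full_mask[y1:y2, x1:x2] = mask
        return full_mask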
You will also need the Mask R-CNN code. I linked to the original Matterport implementation above, but I've forked the repo to fix a bug and to make sure these tutorials don't break with updates. I'm also sharing a dataset I created from scratch; it is COCO-like, meaning it is annotated the same way the COCO dataset is.
In addition to creating masks for new datasets, you can use a pre-trained Mask R-CNN model from this repo to produce editable predicted masks, a step towards annotation methods for instance segmentation that can hopefully scale to larger datasets and be much faster.
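One way to make predicted masks editable is to trace each binary mask into a polygon that an annotation tool can load. A sketch using skimage's contour tracing (the function name is my own):

    import numpy as np
    from skimage import measure

    def mask_to_polygons(binary_mask, tolerance=2.0):
        """Trace a binary mask into COCO-style flat [x1, y1, x2, y2, ...] polygons."""
        polygons = []
        # find_contours returns (row, col) points along iso-valued curves.
        for contour in measure.find_contours(binary_mask.astype(float), 0.5):
            # Simplify the contour so it is practical to edit by hand.
            contour = measure.approximate_polygon(contour, tolerance)
            if len(contour) < 3:
                continue
            # Swap (row, col) to (x, y) and flatten.
            polygons.append(contour[:, ::-1].ravel().tolist())
        return polygons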
Something like this should work, but it depends on how you annotate in VGG:
import json
import math
from itertools import chain

# `helper` stands for the answerer's own utility module (polygon_area, bbox);
# concrete implementations appear in the working solution further down.

def vgg_to_coco(vgg_path: str, outfile: str = None, class_keyword: str = "Class"):
    with open(vgg_path) as f:
        vgg = json.load(f)

    # Map each filename to a numeric image id.
    images_ids_dict = {v["filename"]: i for i, v in enumerate(vgg.values())}
    # TODO: fix the hard-coded image size.
    images_info = [{"file_name": k, "id": v, "width": 1024, "height": 1024}
                   for k, v in images_ids_dict.items()]

    # Collect every class name that appears in the region attributes.
    classes = {class_keyword} | {r["region_attributes"][class_keyword]
                                 for v in vgg.values() for r in v["regions"]
                                 if class_keyword in r["region_attributes"]}
    category_ids_dict = {c: i for i, c in enumerate(classes, 1)}
    categories = [{"supercategory": class_keyword, "id": v, "name": k}
                  for k, v in category_ids_dict.items()]

    annotations = []
    suffix_zeros = math.ceil(math.log10(len(vgg)))
    for i, v in enumerate(vgg.values()):
        for j, r in enumerate(v["regions"]):
            if class_keyword in r["region_attributes"]:
                x, y = r["shape_attributes"]["all_points_x"], r["shape_attributes"]["all_points_y"]
                annotations.append({
                    # COCO polygons are a flat [x1, y1, x2, y2, ...] list.
                    "segmentation": [list(chain.from_iterable(zip(x, y)))],
                    "area": helper.polygon_area(x, y),
                    "bbox": helper.bbox(x, y, out_format="width_height"),
                    "image_id": images_ids_dict[v["filename"]],
                    "category_id": category_ids_dict[r["region_attributes"][class_keyword]],
                    "id": int(f"{i:0>{suffix_zeros}}{j:0>{suffix_zeros}}"),
                    "iscrowd": 0
                })

    coco = {
        "images": images_info,
        "categories": categories,
        "annotations": annotations
    }
    if outfile is None:
        outfile = vgg_path.replace(".json", "_coco.json")
    with open(outfile, "w") as f:
        json.dump(coco, f)
You will have to change the 1024s to your image size; if your images vary in size, you will have to build a map for that (see the sketch below).
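A filename-to-size map can be built cheaply with Pillow, since it only reads the image header; a sketch, assuming all images sit in one directory:

    import os
    from PIL import Image

    def build_size_map(image_dir):
        """Map filename -> (width, height) for every image in a directory."""
        sizes = {}
        for name in os.listdir(image_dir):
            if name.lower().endswith((".jpg", ".jpeg", ".png")):
                with Image.open(os.path.join(image_dir, name)) as img:
                    sizes[name] = img.size  # Pillow reads only the header here, so this is cheap
        return sizes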
You must create your own script for the transformation; I had to do that going from XML annotations to Mask R-CNN JSON.
You can check the example: https://github.com/adions025/XMLtoJson_Mask_RCNN
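The repo has the full details; the core idea for VOC-style XML boxes looks roughly like this (a generic sketch, not the repo's actual code):

    import xml.etree.ElementTree as ET

    def voc_xml_to_coco_boxes(xml_path):
        """Extract (label, [x, y, width, height]) pairs from a Pascal-VOC style XML file."""
        root = ET.parse(xml_path).getroot()
        boxes = []
        for obj in root.iter("object"):
            label = obj.findtext("name")
            bb = obj.find("bndbox")
            xmin, ymin = float(bb.findtext("xmin")), float(bb.findtext("ymin"))
            xmax, ymax = float(bb.findtext("xmax")), float(bb.findtext("ymax"))
            # Convert corner coordinates to a COCO-style [x, y, width, height] box.
            boxes.append((label, [xmin, ymin, xmax - xmin, ymax - ymin]))
        return boxes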
Working solution, extended from @Zac Tod's answer. The image size is computed on the fly.
import json
import math
import os
from itertools import chain

import numpy as np
import skimage.io

def vgg_to_coco(dataset_dir, vgg_path: str, outfile: str = None, class_keyword: str = "label"):
    with open(vgg_path) as f:
        vgg = json.load(f)

    # Read each image once to get its real size instead of hard-coding it.
    images_ids_dict = {}
    images_info = []
    for i, v in enumerate(vgg.values()):
        images_ids_dict[v["filename"]] = i
        image_path = os.path.join(dataset_dir, v["filename"])
        image = skimage.io.imread(image_path)
        height, width = image.shape[:2]
        images_info.append({"file_name": v["filename"], "id": i, "width": width, "height": height})

    # Here regions is a dict (newer VIA exports), so iterate over its values.
    classes = {class_keyword} | {r["region_attributes"][class_keyword]
                                 for v in vgg.values() for r in v["regions"].values()
                                 if class_keyword in r["region_attributes"]}
    category_ids_dict = {c: i for i, c in enumerate(classes, 1)}
    categories = [{"supercategory": class_keyword, "id": v, "name": k}
                  for k, v in category_ids_dict.items()]

    annotations = []
    suffix_zeros = math.ceil(math.log10(len(vgg)))
    for i, v in enumerate(vgg.values()):
        for j, r in enumerate(v["regions"].values()):
            if class_keyword in r["region_attributes"]:
                x, y = r["shape_attributes"]["all_points_x"], r["shape_attributes"]["all_points_y"]
                annotations.append({
                    "segmentation": [list(chain.from_iterable(zip(x, y)))],
                    "area": PolyArea(x, y),  # shoelace formula, defined below
                    "bbox": [min(x), min(y), max(x) - min(x), max(y) - min(y)],  # COCO [x, y, width, height]
                    "image_id": images_ids_dict[v["filename"]],
                    "category_id": category_ids_dict[r["region_attributes"][class_keyword]],
                    "id": int(f"{i:0>{suffix_zeros}}{j:0>{suffix_zeros}}"),
                    "iscrowd": 0
                })

    coco = {
        "images": images_info,
        "categories": categories,
        "annotations": annotations
    }
    if outfile is None:
        outfile = vgg_path.replace(".json", "_coco.json")
    with open(outfile, "w") as f:
        json.dump(coco, f)
My data was labeled using makesense.ai, and region_attributes looks like the snippet below, so pass class_keyword="label" in the function call.
"region_attributes": {
"label": "box"
}
To compute the polygon area, the code is copied from this answer:
def PolyArea(x, y):
    # Shoelace formula: area of a simple polygon from its vertex coordinates.
    return 0.5 * np.abs(np.dot(x, np.roll(y, 1)) - np.dot(y, np.roll(x, 1)))
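Putting it together, the call for this labelling setup would look like the snippet below (the paths are placeholders), and the shoelace formula is easy to sanity-check on a unit square:

    # Hypothetical paths; adjust to your layout.
    vgg_to_coco("datasets/train", "datasets/train/via_annotations.json",
                class_keyword="label")

    # Sanity check: a 1x1 square has area 1.
    assert PolyArea([0, 1, 1, 0], [0, 0, 1, 1]) == 1.0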