Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the fastest Mask R-CNN implementation available

I'm running a Mask R-CNN model on an edge device (with an NVIDIA GTX 1080). I am currently using the Detectron2 Mask R-CNN implementation and I archieve an inference speed of around 5 FPS.

To speed this up I looked at other inference engines and model implementations. For example ONNX, but I'm not able to gain a faster inference speed.

TensorRT looks very promising to me but I did not found a ready "out-of-the-box" implementation for it.

Are there any other mature and fast inference engines or other techniques to speed up the inference?

like image 511
Sharif Elfouly Avatar asked Dec 18 '19 14:12

Sharif Elfouly


People also ask

How fast is mask R-CNN?

Mask R-CNN is simple to train and adds only a small overhead to Faster R-CNN, running at 5 fps. Moreover, Mask R-CNN is easy to generalize to other tasks, e.g., allowing us to estimate human poses in the same framework.

What is better than mask R-CNN?

YOLO's performance was slightly better than Mask R-CNN, shown by 98.96% and 96.73% precision, and 80.93% and 75.43% recall, respectively. The experimental result also revealed that YOLO outperforms Mask R-CNN with mAP of 80.12% and 73.39%, respectively.

How many layers does mask R-CNN have?

The Convolutional Neural Network Architecture consists of three main layers: Convolutional layer : This layer helps to abstract the input image as a feature map via the use of filters and kernels.

What is masked R-CNN?

Mask RCNN is a deep neural network aimed to solve instance segmentation problem in machine learning or computer vision. In other words, it can separate different objects in a image or a video. You give it a image, it gives you the object bounding boxes, classes and masks.


2 Answers

It's almost impossible to get higher inference speed for Mask R-CNN on GTX 1080. You may check detectron2 by Facebook AI Research.

Otherwise, I'd suggest to use YOLACT - (You Only Look At CoefficienTs), it can achieve real-time instance segmentation.

enter image description here

On the other hand, if you don't need instance segmentation, you can use YOLO, SSD, etc for object detection.

like image 148
kHarshit Avatar answered Oct 19 '22 13:10

kHarshit


OpenCV 4.5.0 with DNN_BACKEND_CUDA and DNN_TARGET_CUDA/DNN_TARGET_CUDA_FP16.

Mask RCNN with 1024 x 1024 input image

Device             | FPS
------------------ | -------
GTX 1080 Ti (FP32) | 29
RTX 2080 Ti (FP16) | 60

FPS measured includes NMS but excludes other preprocessing and postprocessing. The network fully runs end-to-end on GPU.

Benchmark code: https://gist.github.com/YashasSamaga/48bdb167303e10f4d07b754888ddbdcf

like image 2
Yashas Avatar answered Oct 19 '22 13:10

Yashas