I'm running a Mask R-CNN model on an edge device (with an NVIDIA GTX 1080). I am currently using the Detectron2 Mask R-CNN implementation and I achieve an inference speed of around 5 FPS.
To speed this up I looked at other inference engines and model implementations, for example ONNX, but I was not able to gain any faster inference speed.
TensorRT looks very promising to me, but I did not find a ready "out-of-the-box" implementation for it.
Are there any other mature and fast inference engines, or other techniques, to speed up inference?
It's almost impossible to get a higher inference speed for Mask R-CNN on a GTX 1080. You may check detectron2 by Facebook AI Research.
Otherwise, I'd suggest using YOLACT (You Only Look At CoefficienTs), which can achieve real-time instance segmentation.
On the other hand, if you don't need instance segmentation, you can use YOLO, SSD, etc. for object detection; a rough sketch with a lightweight detector follows below.
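For illustration only, here is a minimal sketch (not part of the original answer) of running a lighter, box-only detector with torchvision's pretrained SSDlite320-MobileNetV3; the model choice and the dummy input are assumptions, not something the answer prescribes:

```python
# Sketch: a lightweight box-only detector as an alternative to Mask R-CNN
# when instance masks are not needed. The model choice (SSDlite320-MobileNetV3)
# is an assumption for illustration.
import torch
import torchvision

model = torchvision.models.detection.ssdlite320_mobilenet_v3_large(pretrained=True)
model.eval().cuda()

# Dummy 320x320 RGB frame scaled to [0, 1]; replace with a real image tensor.
frame = torch.rand(3, 320, 320).cuda()

with torch.no_grad():
    detections = model([frame])[0]  # dict with "boxes", "labels", "scores"

print(detections["boxes"].shape, detections["scores"][:5])
```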
OpenCV 4.5.0 with DNN_BACKEND_CUDA and DNN_TARGET_CUDA/DNN_TARGET_CUDA_FP16.
Mask R-CNN with a 1024 x 1024 input image:
Device | FPS
------------------ | -------
GTX 1080 Ti (FP32) | 29
RTX 2080 Ti (FP16) | 60
The measured FPS includes NMS but excludes other preprocessing and postprocessing; the network runs fully end-to-end on the GPU.
Benchmark code: https://gist.github.com/YashasSamaga/48bdb167303e10f4d07b754888ddbdcf
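As an illustration, here is a rough sketch (not the author's benchmark script; the model file names are placeholders) of selecting the CUDA backend and target in OpenCV's DNN module for the TensorFlow Mask R-CNN model:

```python
# Sketch: enabling OpenCV DNN's CUDA backend for the TensorFlow Mask R-CNN model.
# File names are placeholders; see the linked gist for the actual benchmark code.
import cv2

net = cv2.dnn.readNetFromTensorflow("frozen_inference_graph.pb", "mask_rcnn.pbtxt")

# Requires OpenCV built with CUDA support (WITH_CUDA=ON, OPENCV_DNN_CUDA=ON).
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
# DNN_TARGET_CUDA_FP16 helps on GPUs with fast FP16 (e.g. RTX cards);
# on Pascal cards like the GTX 1080, DNN_TARGET_CUDA (FP32) is usually faster.
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)

image = cv2.imread("input.jpg")
blob = cv2.dnn.blobFromImage(image, size=(1024, 1024), swapRB=True, crop=False)
net.setInput(blob)

# The TensorFlow Mask R-CNN graph exposes boxes and masks as separate outputs.
boxes, masks = net.forward(["detection_out_final", "detection_masks"])
print(boxes.shape, masks.shape)
```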