
TorchScript vs TensorRT for real-time inference

I have trained an object detection model for a real-time production application, and I have the following two options. Can anyone suggest which way of running inference on a Jetson Xavier gives the best performance? Any other suggestions are also welcome.

  1. Convert the model to ONNX format and use it with TensorRT
  2. Save the model as TorchScript and run inference in C++ (a minimal export sketch for both options follows this list)
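
For reference, both paths start from the same trained PyTorch model. Below is a minimal sketch of the two exports, using a torchvision detector as a hypothetical stand-in for your trained model; the input shape, opset version, and file names are assumptions to adjust for your network.

```python
import torch
import torchvision

# Hypothetical stand-in for your trained detector (torchvision >= 0.13 API).
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights=None).eval()
dummy = torch.randn(1, 3, 640, 640)  # adjust to your input resolution

# Option 1: export to ONNX, to be consumed by TensorRT.
torch.onnx.export(model, dummy, "detector.onnx", opset_version=11)

# Option 2: save as TorchScript for C++ (libtorch) inference.
scripted = torch.jit.script(model)  # torch.jit.trace(model, dummy) also works for trace-friendly models
scripted.save("detector.pt")
```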
asked Sep 16 '25 by Akshay Kumar


1 Answer

On Jetson hardware, my experience is that TensorRT is definitely faster. You can convert ONNX models to TensorRT using NVIDIA's ONNX parser, and for optimal performance you can enable mixed precision. How to convert ONNX to TensorRT is explained in the TensorRT developer guide: see Section 3.2.5 for the Python bindings and Section 2.2.5 for the C++ bindings.
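
As a rough illustration, here is a minimal sketch of building an FP16 engine from an ONNX file with the TensorRT Python API. It assumes TensorRT 8.x (the workspace-size call changed in later releases), and the file names are placeholders.

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(onnx_path, fp16=True):
    builder = trt.Builder(TRT_LOGGER)
    # Explicit-batch network, as required for ONNX models.
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, TRT_LOGGER)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            raise RuntimeError("failed to parse ONNX model")

    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 30  # 1 GiB scratch space (TRT 8.x API)
    if fp16 and builder.platform_has_fast_fp16:
        config.set_flag(trt.BuilderFlag.FP16)  # enable mixed precision

    # Returns a serialized engine; write it to disk and deserialize
    # later with trt.Runtime for inference.
    return builder.build_serialized_network(network, config)

if __name__ == "__main__":
    engine = build_engine("detector.onnx")
    with open("detector.engine", "wb") as f:
        f.write(engine)
```

For a quick experiment you can also skip the API entirely and use the `trtexec` tool that ships with TensorRT, e.g. `trtexec --onnx=detector.onnx --saveEngine=detector.engine --fp16`.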

answered Sep 18 '25 by joostblack