
TorchScript vs TensorRT for real-time inference

I have trained an object detection model for a real-time production application, and I have the following two options. Can anyone suggest which way of running inference on a Jetson Xavier gives the best performance? Any other suggestions are also welcome.

  1. Convert the model to ONNX format and use it with TensorRT
  2. Save the model as TorchScript and run inference in C++ (a minimal export sketch for both options follows this list)
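
For reference, both paths start from the same trained PyTorch model. Below is a minimal sketch of the two exports, using a torchvision detector as a hypothetical stand-in for your trained model; the input shape, opset version, and file names are assumptions to adjust for your network.

```python
import torch
import torchvision

# Hypothetical stand-in for your trained detector (torchvision >= 0.13 API).
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights=None).eval()
dummy = torch.randn(1, 3, 640, 640)  # adjust to your input resolution

# Option 1: export to ONNX, to be consumed by TensorRT.
torch.onnx.export(model, dummy, "detector.onnx", opset_version=11)

# Option 2: save as TorchScript for C++ (libtorch) inference.
scripted = torch.jit.script(model)  # torch.jit.trace(model, dummy) also works for trace-friendly models
scripted.save("detector.pt")
```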
asked Sep 16 '25 by Akshay Kumar


1 Answer

On Jetson hardware, my experience is that TensorRT is definitely faster. You can convert ONNX models to TensorRT using NVIDIA's ONNX parser, and for optimal performance you can enable mixed precision. How to convert ONNX to TensorRT is explained in the TensorRT developer guide: see Section 3.2.5 for the Python bindings and Section 2.2.5 for the C++ bindings.
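
As a rough illustration, here is a minimal sketch of building an FP16 engine from an ONNX file with the TensorRT Python API. It assumes TensorRT 8.x (the workspace-size call changed in later releases), and the file names are placeholders.

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(onnx_path, fp16=True):
    builder = trt.Builder(TRT_LOGGER)
    # Explicit-batch network, as required for ONNX models.
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, TRT_LOGGER)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            raise RuntimeError("failed to parse ONNX model")

    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 30  # 1 GiB scratch space (TRT 8.x API)
    if fp16 and builder.platform_has_fast_fp16:
        config.set_flag(trt.BuilderFlag.FP16)  # enable mixed precision

    # Returns a serialized engine; write it to disk and deserialize
    # later with trt.Runtime for inference.
    return builder.build_serialized_network(network, config)

if __name__ == "__main__":
    engine = build_engine("detector.onnx")
    with open("detector.engine", "wb") as f:
        f.write(engine)
```

For a quick experiment you can also skip the API entirely and use the `trtexec` tool that ships with TensorRT, e.g. `trtexec --onnx=detector.onnx --saveEngine=detector.engine --fp16`.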

answered Sep 18 '25 by joostblack