I have trained an object detection model to be used in production for real-time applications, and I am deciding between two options for deployment. Can anyone suggest the best way to run inference on a Jetson Xavier for maximum performance? Any other suggestions are also welcome.
On Jetson hardware, my experience is that using TensorRT is definitely faster. You can convert ONNX models to TensorRT using NVIDIA's ONNX parser. For optimal performance you can also enable mixed precision (e.g., FP16, which the Xavier GPU supports natively). How to convert ONNX to TensorRT is explained in the TensorRT developer guide: see Section 3.2.5 for the Python bindings and Section 2.2.5 for the C++ bindings.
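As a minimal sketch of that conversion, assuming the TensorRT 8.x Python API (the exact calls vary across versions) and hypothetical file names `detector.onnx` / `detector.engine`, it looks roughly like this:

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(onnx_path: str, engine_path: str, use_fp16: bool = True) -> None:
    """Parse an ONNX model and serialize a TensorRT engine to disk."""
    builder = trt.Builder(TRT_LOGGER)
    # ONNX models require an explicit-batch network definition.
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    )
    parser = trt.OnnxParser(network, TRT_LOGGER)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            # Surface the parser's error messages before giving up.
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            raise RuntimeError(f"Failed to parse {onnx_path}")

    config = builder.create_builder_config()
    # Enable FP16 kernels where the hardware supports them (Xavier does).
    if use_fp16 and builder.platform_has_fast_fp16:
        config.set_flag(trt.BuilderFlag.FP16)

    serialized = builder.build_serialized_network(network, config)
    if serialized is None:
        raise RuntimeError("Engine build failed")
    with open(engine_path, "wb") as f:
        f.write(serialized)

build_engine("detector.onnx", "detector.engine")
```

Alternatively, the `trtexec` tool that ships with JetPack can do the same conversion from the command line, e.g. `trtexec --onnx=detector.onnx --saveEngine=detector.engine --fp16`.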