I used Nvidia's Transfer Learning Toolkit (TLT) to train a model and then used tlt-converter to convert the .etlt model into an .engine file.
I want to use this .engine file for inference in Python, but since I trained with TLT I don't have a frozen graph or .pb file, which is what all the TensorRT inference tutorials expect.
I would like to know whether Python inference is possible on .engine files. If not, what conversions (UFF, ONNX) are supported to make this possible?
The tutorial consists of the following steps: Setup (launch the test container and generate the TensorRT engine from a PyTorch model exported to ONNX and converted using trtexec); C++ runtime API (run inference using the engine and TensorRT's C++ API); and Python runtime API (run inference using the engine and TensorRT's Python API).
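If you would rather stay in Python than call trtexec, roughly the same ONNX-to-engine conversion can be done with TensorRT's Python builder and ONNX parser. A minimal sketch, assuming TensorRT 8.x and placeholder file names ("model.onnx", "model.engine"):

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
# ONNX models require an explicit-batch network definition
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("Failed to parse the ONNX model")

config = builder.create_builder_config()
config.max_workspace_size = 1 << 30  # 1 GiB of builder scratch space

# Build and save the serialized engine (plan file)
serialized_engine = builder.build_serialized_network(network, config)
with open("model.engine", "wb") as f:
    f.write(serialized_engine)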
A plan file is the serialized file format of a TensorRT engine. The plan file must be deserialized before you can run inference with the TensorRT runtime.
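Deserializing a plan/.engine file in Python looks roughly like this (a minimal sketch; "model.engine" is a placeholder path):

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)

# Read the serialized plan/.engine file and deserialize it into an ICudaEngine
with open("model.engine", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())

# The execution context holds the per-inference state
context = engine.create_execution_context()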
Python inference is possible with .engine files. The example further below loads a .trt file (which is the same thing as an .engine file) from disk and performs a single inference.
In this project I converted an ONNX model to a TRT engine with the onnx2trt executable before using it. You can even convert a PyTorch model to TRT by going through ONNX as an intermediate step, as sketched right after this paragraph.
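A minimal sketch of that PyTorch-to-ONNX export step (the model, input shape, and file names here are placeholders, not taken from the original project):

import torch
import torchvision

# Placeholder network and input shape; substitute your own model
model = torchvision.models.resnet18().eval()
dummy_input = torch.randn(1, 3, 224, 224)

# Export to ONNX, which onnx2trt / trtexec / the TensorRT ONNX parser can consume
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    input_names=["input"],
    output_names=["output"],
    opset_version=11,
)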
import tensorrt as trt
import numpy as np
import os
import pycuda.driver as cuda
import pycuda.autoinit  # noqa: F401 -- importing this creates the CUDA context


class HostDeviceMem(object):
    """Pairs a page-locked host buffer with its device (GPU) buffer."""

    def __init__(self, host_mem, device_mem):
        self.host = host_mem
        self.device = device_mem

    def __str__(self):
        return "Host:\n" + str(self.host) + "\nDevice:\n" + str(self.device)

    def __repr__(self):
        return self.__str__()


class TrtModel:
    def __init__(self, engine_path, max_batch_size=1, dtype=np.float32):
        self.engine_path = engine_path
        self.dtype = dtype
        self.logger = trt.Logger(trt.Logger.WARNING)
        self.runtime = trt.Runtime(self.logger)
        self.engine = self.load_engine(self.runtime, self.engine_path)
        self.max_batch_size = max_batch_size
        self.inputs, self.outputs, self.bindings, self.stream = self.allocate_buffers()
        self.context = self.engine.create_execution_context()

    @staticmethod
    def load_engine(trt_runtime, engine_path):
        # Register any plugins the engine relies on, then deserialize the plan file
        trt.init_libnvinfer_plugins(None, "")
        with open(engine_path, 'rb') as f:
            engine_data = f.read()
        engine = trt_runtime.deserialize_cuda_engine(engine_data)
        return engine

    def allocate_buffers(self):
        inputs = []
        outputs = []
        bindings = []
        stream = cuda.Stream()

        for binding in self.engine:
            # Allocate page-locked host memory and device memory for every binding
            size = trt.volume(self.engine.get_binding_shape(binding)) * self.max_batch_size
            host_mem = cuda.pagelocked_empty(size, self.dtype)
            device_mem = cuda.mem_alloc(host_mem.nbytes)

            bindings.append(int(device_mem))

            if self.engine.binding_is_input(binding):
                inputs.append(HostDeviceMem(host_mem, device_mem))
            else:
                outputs.append(HostDeviceMem(host_mem, device_mem))

        return inputs, outputs, bindings, stream

    def __call__(self, x: np.ndarray, batch_size=2):
        x = x.astype(self.dtype)
        np.copyto(self.inputs[0].host, x.ravel())

        # Copy inputs to the GPU, run inference, then copy outputs back to the host
        for inp in self.inputs:
            cuda.memcpy_htod_async(inp.device, inp.host, self.stream)

        self.context.execute_async(batch_size=batch_size, bindings=self.bindings,
                                   stream_handle=self.stream.handle)

        for out in self.outputs:
            cuda.memcpy_dtoh_async(out.host, out.device, self.stream)

        self.stream.synchronize()
        return [out.host.reshape(batch_size, -1) for out in self.outputs]


if __name__ == "__main__":
    batch_size = 1
    trt_engine_path = os.path.join("..", "models", "main.trt")
    model = TrtModel(trt_engine_path)
    shape = model.engine.get_binding_shape(0)

    # Feed random data matching the engine's input shape, scaled to [0, 1]
    data = np.random.randint(0, 255, (batch_size, *shape[1:])) / 255
    result = model(data, batch_size)
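One caveat on the example above: execute_async(batch_size=...) targets implicit-batch engines. If your engine was built with an explicit batch dimension (engines parsed from ONNX in TensorRT 7+ are), replace that call inside __call__ with execute_async_v2, e.g.:

self.context.execute_async_v2(bindings=self.bindings, stream_handle=self.stream.handle)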
Stay safe y'all!