I have a trained PyTorch model that I would now like to export to Caffe2 using ONNX. This part seems fairly simple and well documented. However, I now want to "load" that model into a Java program in order to perform predictions within my program (a Flink streaming application). What is the best way to do this? I haven't been able to find any documentation on the website describing how to do this.
Currently it's a bit tricky, but there is a way. You will need JavaCPP together with its ONNX and nGraph presets (the org.bytedeco onnx and ngraph preset artifacts):
I will use single_relu.onnx as an example:
// Imports for the JavaCPP presets for ONNX and nGraph
// (package names follow the org.bytedeco preset convention)
import java.nio.file.Files;
import java.nio.file.Paths;
import org.bytedeco.javacpp.BytePointer;
import org.bytedeco.javacpp.FloatPointer;
import org.bytedeco.ngraph.Backend;
import org.bytedeco.ngraph.Executable;
import org.bytedeco.ngraph.Function;
import org.bytedeco.ngraph.NgraphTensorVector;
import org.bytedeco.ngraph.Shape;
import org.bytedeco.ngraph.SizeTVector;
import org.bytedeco.ngraph.Tensor;
import org.bytedeco.onnx.ModelProto;
import org.bytedeco.onnx.StringVector;
import static org.bytedeco.ngraph.global.ngraph.*;
import static org.bytedeco.onnx.global.onnx.*;

public class SingleReluExample {
    public static void main(String[] args) throws Exception {
        // read ONNX
        byte[] bytes = Files.readAllBytes(Paths.get("single_relu.onnx"));
        ModelProto model = new ModelProto();
        ParseProtoFromBytes(model, new BytePointer(bytes), bytes.length); // parse ONNX -> protobuf model

        // preprocess model in any way you like (you can skip this step)
        check_model(model);
        InferShapes(model);
        StringVector passes = new StringVector("eliminate_nop_transpose", "eliminate_nop_pad",
                "fuse_consecutive_transposes", "fuse_transpose_into_gemm");
        Optimize(model, passes);
        check_model(model);
        ConvertVersion(model, 8); // convert to opset version 8
        BytePointer serialized = model.SerializeAsString();
        System.out.println("model=" + serialized.getString());

        // prepare nGraph backend and tensors of shape [1, 2]
        Backend backend = Backend.create("CPU");
        Shape shape = new Shape(new SizeTVector(1, 2));
        Tensor input = backend.create_tensor(f32(), shape);
        Tensor output = backend.create_tensor(f32(), shape);

        // fill the input tensor (example values; a ReLU should map {-1, 2} to {0, 2})
        float[] in = {-1.0f, 2.0f};
        input.write(new FloatPointer(in), 0, in.length * 4); // length in bytes

        Function ngFunction = import_onnx_model(serialized); // convert ONNX -> nGraph
        Executable exec = backend.compile(ngFunction);
        exec.call(new NgraphTensorVector(output), new NgraphTensorVector(input)); // outputs first, then inputs

        // collect result into an array
        float[] r = new float[2];
        FloatPointer p = new FloatPointer(r);
        output.read(p, 0, r.length * 4); // length in bytes
        p.get(r);

        // print result
        System.out.println("[");
        for (int i = 0; i < shape.get(0); i++) {
            System.out.print(" [");
            for (int j = 0; j < shape.get(1); j++) {
                System.out.print(r[i * (int) shape.get(1) + j] + " ");
            }
            System.out.println("]");
        }
        System.out.println("]");
    }
}