I am trying to run a TensorFlow model in my Android application, but the same trained model gives different (wrong) inference results compared to when it is run with Python on desktop.
The model is a simple sequential CNN to recognize characters, much like this number plate recognition network, minus the windowing, as my model has the characters already cropped into place.
I saved the Keras model as a .pb file using this code.
Python code, this works as expected:
import numpy as np
import tensorflow as tf
from scipy import ndimage

# Load the image, normalize to [0, 1], and add a batch dimension
test_image = [ndimage.imread("test_image.png", mode="RGB").astype(float) / 255]
imTensor = np.asarray(test_image)

def load_graph(model_file):
    graph = tf.Graph()
    graph_def = tf.GraphDef()
    with open(model_file, "rb") as f:
        graph_def.ParseFromString(f.read())
    with graph.as_default():
        tf.import_graph_def(graph_def)
    return graph

graph = load_graph("model.pb")
with tf.Session(graph=graph) as sess:
    input_operation = graph.get_operation_by_name("import/conv2d_1_input")
    output_operation = graph.get_operation_by_name("import/output_node0")
    results = sess.run(output_operation.outputs[0],
                       {input_operation.outputs[0]: imTensor})
Android code, based on this example; this gives seemingly random results:
Bitmap bitmap = null;
try {
    InputStream stream = getAssets().open("test_image.png");
    bitmap = BitmapFactory.decodeStream(stream);
} catch (IOException e) {
    e.printStackTrace();
}

inferenceInterface = new TensorFlowInferenceInterface(context.getAssets(), "model.pb");

int[] intValues = new int[129 * 45];
float[] floatValues = new float[129 * 45 * 3];
String outputName = "output_node0";
String[] outputNodes = new String[]{outputName};
float[] outputs = new float[4 * 36];

bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight());
// Unpack each ARGB pixel into normalized R, G, B floats
for (int i = 0; i < intValues.length; ++i) {
    final int val = intValues[i];
    floatValues[i * 3 + 0] = ((val >> 16) & 0xFF) / 255;
    floatValues[i * 3 + 1] = ((val >> 8) & 0xFF) / 255;
    floatValues[i * 3 + 2] = (val & 0xFF) / 255;
}

inferenceInterface.feed("conv2d_1_input", floatValues, 1, 45, 129, 3); // NHWC: height 45, width 129
inferenceInterface.run(outputNodes, false);
inferenceInterface.fetch(outputName, outputs);
Any help is greatly appreciated!
One problem is in these lines:
floatValues[i * 3 + 0] = ((val >> 16) & 0xFF) / 255;
floatValues[i * 3 + 1] = ((val >> 8) & 0xFF) / 255;
floatValues[i * 3 + 2] = (val & 0xFF) / 255;
where the RGB channel values are divided by the integer 255, so integer division yields an integer result: 0 for every value except 255, which yields 1.
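A minimal fix is to force floating-point division by using a float literal, matching the /255 normalization in the Python code above:

floatValues[i * 3 + 0] = ((val >> 16) & 0xFF) / 255.0f;
floatValues[i * 3 + 1] = ((val >> 8) & 0xFF) / 255.0f;
floatValues[i * 3 + 2] = (val & 0xFF) / 255.0f;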
Moreover, even if the division is performed with 255.0 and yields a float between 0.0 and 1.0, it may still pose a problem: the values are not distributed across the target space (0..1) the way they were in nature. To explain: a value of 255 in the sensor domain (e.g. the R channel) means that the natural value of the measured signal fell somewhere in the "255" bucket, which covers a whole range of energies/intensities. Mapping this value to 1.0 most likely cuts off half of its range, as subsequent calculations could saturate at a maximum multiplier of 1.0, which really is only the midpoint of a bucket 1/256 wide. So the transformation might more correctly be a mapping to the midpoints of a 256-bucket division of the 0..1 range:
((val & 0xff) / 256.0) + (0.5/256.0)
but this is just a guess from my side.
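Applied to the conversion loop above, that guess would read:

floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) / 256.0f) + (0.5f / 256.0f);
floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) / 256.0f) + (0.5f / 256.0f);
floatValues[i * 3 + 2] = ((val & 0xFF) / 256.0f) + (0.5f / 256.0f);

Whichever mapping you choose, make sure it matches the preprocessing that was applied when the network was trained (the Python code above divides by 255).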