
Same Tensorflow model giving different results on Android and Python

I am trying to run a Tensorflow model on my Android application, but the same trained model gives different results (wrong inference) compared to when it is run on Python on desktop.

The model is a simple sequential CNN to recognize characters, much like this number plate recognition network, minus the windowing, as my model has the characters already cropped into place.

I have:

  • Model saved in protobuf (.pb) file - modeled and trained in Keras on Python/Linux + GPU
  • The inference was tested on a different computer on pure Tensorflow, to make sure Keras was not the culprit. Here, the results were as expected.
  • Tensorflow 1.3.0 is used on both Python and Android, installed from pip on Python and from jcenter on Android.
  • The results on Android do not resemble the expected outcome.
  • The input is a 129*45 RGB image, so a 129*45*3 array, and the output is a 4*36 array (representing 4 characters from 0-9 and a-z).
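
For illustration, a flat 4*36 output of this kind could be decoded into a string roughly as follows. This is only a sketch: the class name `OutputDecoder`, the class order (digits first, then lowercase letters), and the row-major layout (all 36 scores for character 1, then character 2, and so on) are assumptions that the post does not confirm.

```java
// Hypothetical helper: turns a flat [4 * 36] score array into a 4-character string.
final class OutputDecoder {

    // Assumed class order: '0'-'9' followed by 'a'-'z' (36 classes in total).
    static final String ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz";

    /** Picks the highest-scoring class for each character position. */
    static String decode(float[] outputs) {
        final int classes = ALPHABET.length();          // 36
        final int positions = outputs.length / classes; // 4 for a 4*36 array
        StringBuilder sb = new StringBuilder();
        for (int p = 0; p < positions; ++p) {
            int best = 0;
            for (int c = 1; c < classes; ++c) {
                // Assumed layout: scores for position p start at index p * classes.
                if (outputs[p * classes + c] > outputs[p * classes + best]) {
                    best = c;
                }
            }
            sb.append(ALPHABET.charAt(best));
        }
        return sb.toString();
    }
}
```

Decoding the output the same way on both platforms makes it easier to tell a preprocessing mismatch apart from a decoding mismatch.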

I used this code to save the Keras model as a .pb file.

Python code, this works as expected:

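import numpy as np
import tensorflow as tf
from scipy import ndimage

# Load the image as RGB, scale it to [0, 1], and wrap it in a batch of one.
test_image = [ndimage.imread("test_image.png", mode="RGB").astype(float)/255]

imTensor = np.asarray(test_image)

def load_graph(model_file):
  graph = tf.Graph()
  graph_def = tf.GraphDef()

  # Parse the serialized GraphDef, then import it (default name scope: "import").
  with open(model_file, "rb") as f:
    graph_def.ParseFromString(f.read())
  with graph.as_default():
    tf.import_graph_def(graph_def)

  return graph

# The post does not show this call; the file name is assumed from the Android code.
graph = load_graph("model.pb")

with tf.Session(graph=graph) as sess:

    input_operation = graph.get_operation_by_name("import/conv2d_1_input")
    output_operation = graph.get_operation_by_name("import/output_node0")

    results = sess.run(output_operation.outputs[0],
                  {input_operation.outputs[0]: imTensor})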

Android code, based on this example; this gives seemingly random results:

Bitmap bitmap;
try {
    InputStream stream = getAssets().open("test_image.png");
    bitmap = BitmapFactory.decodeStream(stream);
} catch (IOException e) {
    // handler body not shown in the post
}

inferenceInterface = new TensorFlowInferenceInterface(context.getAssets(), "model.pb");
int[] intValues = new int[129*45];
float[] floatValues = new float[129*45*3];
String outputName = "output_node0";
String[] outputNodes = new String[]{outputName};
float[] outputs = new float[4*36];

bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight());
for (int i = 0; i < intValues.length; ++i) {
    final int val = intValues[i];
    floatValues[i * 3 + 0] = ((val >> 16) & 0xFF) / 255;
    floatValues[i * 3 + 1] = ((val >> 8) & 0xFF) / 255;
    floatValues[i * 3 + 2] = (val & 0xFF) / 255;
}

inferenceInterface.feed("conv2d_1_input", floatValues, 1, 45, 129, 3);
inferenceInterface.run(outputNodes, false);
inferenceInterface.fetch(outputName, outputs);

Any help is greatly appreciated!

rednuht asked Aug 30 '17 13:08


1 Answer

One problem is in these lines:

    floatValues[i * 3 + 0] = ((val >> 16) & 0xFF) / 255;
    floatValues[i * 3 + 1] = ((val >> 8) & 0xFF) / 255;
    floatValues[i * 3 + 2] = (val & 0xFF) / 255;

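Here the RGB values are divided by an integer, so Java performs integer division and the result is an integer: 0 for every channel value except 255, which yields 1. Dividing by a floating-point literal such as 255.0f keeps the fractional part.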
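
To make the difference concrete, here is a minimal sketch that puts the truncating division next to a floating-point one. The class and method names (`PixelNormalization`, `redTruncated`, `redScaled`, `toRgbFloats`) are made up for illustration; only the bit-shifting and the division by 255 come from the question.

```java
// Sketch: integer vs. floating-point division when normalizing packed ARGB pixels.
final class PixelNormalization {

    /** Variant from the question: int / int truncates before the value becomes a float. */
    static float redTruncated(int argb) {
        return ((argb >> 16) & 0xFF) / 255; // integer division, result widened to float
    }

    /** Variant with a float literal, which forces floating-point division. */
    static float redScaled(int argb) {
        return ((argb >> 16) & 0xFF) / 255.0f;
    }

    /** Converts packed ARGB pixels to an interleaved RGB float array scaled to [0, 1]. */
    static float[] toRgbFloats(int[] argbPixels) {
        float[] out = new float[argbPixels.length * 3];
        for (int i = 0; i < argbPixels.length; ++i) {
            final int val = argbPixels[i];
            out[i * 3]     = ((val >> 16) & 0xFF) / 255.0f; // R
            out[i * 3 + 1] = ((val >> 8) & 0xFF) / 255.0f;  // G
            out[i * 3 + 2] = (val & 0xFF) / 255.0f;         // B
        }
        return out;
    }

    public static void main(String[] args) {
        int pixel = 0xFF804020; // opaque pixel with R=128, G=64, B=32
        System.out.println(redTruncated(pixel)); // 128 / 255 truncates to 0
        System.out.println(redScaled(pixel));    // roughly 0.502
    }
}
```

With the truncating version, almost every entry of the input tensor is 0, which would be consistent with the seemingly random predictions described in the question.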

Moreover, even a division by 255.0, which yields a float between 0 and 1.0, may pose a problem, because the values are not distributed over the target range (0..1) the way they were in nature. To explain: a value of 255 in the sensor domain (the R value, for example) means that the measured signal fell somewhere in the "255" bucket, which covers a whole range of energies or intensities. Mapping this value to exactly 1.0 will most likely cut off half of that range, since subsequent calculations could saturate at a maximum multiplier of 1.0, which is really only the midpoint of a ±1/256 bucket. So a more correct transformation might map each value to the midpoint of its bucket in a 256-bucket division of the 0..1 range:

((val & 0xff) / 256.0) + (0.5/256.0)

but this is just a guess on my part.
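
As a rough sketch, the two mappings can be compared side by side. The names `BucketMidpoint`, `endpoint`, and `midpoint` are hypothetical. Note also that the Python code in the question divides by 255, so the Android side presumably needs to use the same normalization the model saw during training; changing it on one side only would introduce a new mismatch.

```java
// Sketch: endpoint scaling (v / 255) vs. the bucket-midpoint mapping suggested above.
final class BucketMidpoint {

    /** Endpoint mapping: 0 maps to 0.0 and 255 maps to 1.0. */
    static float endpoint(int channel) {
        return channel / 255.0f;
    }

    /** Midpoint mapping: centre of bucket `channel` when [0, 1] is split into 256 equal buckets. */
    static float midpoint(int channel) {
        return (channel / 256.0f) + (0.5f / 256.0f);
    }

    public static void main(String[] args) {
        for (int v : new int[]{0, 128, 255}) {
            System.out.println(v + ": endpoint=" + endpoint(v) + ", midpoint=" + midpoint(v));
        }
    }
}
```

The two mappings differ by at most half a bucket width (about 0.002), so the midpoint variant is a small refinement compared with the integer-division bug.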

Vroomfondel answered Nov 14 '22 23:11
