Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Failed to run the tflite model on Interpreter due to Internal Error

Tags:

I am trying to build an offline translator for android. My model is highly inspired from this guide: https://www.tensorflow.org/tutorials/text/nmt_with_attention. I just did some modifications to make sure the model is serialisable. (You can find the code for the model at the end)

The model works perfectly on my jupyter notebook. I am using Tensorflow version: 2.3.0-dev20200617, I also was able to generate the tflite file using the following snippet:

converter = tf.lite.TFLiteConverter.from_keras_model(partial_model)
tflite_model = converter.convert()

with tf.io.gfile.GFile('goog_nmt_v2.tflite', 'wb') as f:
  f.write(tflite_model)

However when I used the generated tflite model to get predictions on android, it throws the error java.lang.IllegalArgumentException: Internal error: Failed to run on the given Interpreter: tensorflow/lite/kernels/concatenation.cc:73 t->dims->data[d] != t0->dims->data[d] (8 != 1) Node number 84 (CONCATENATION) failed to prepare.

This is strange because I have provided the exact same input dimensions as I did in my jupyter notebook. Here is the java code that is used to test (dummy inputs) if model runs on android:

 HashMap<Integer, Object> outputVal = new HashMap<>();
        for(int i=0; i<2; i++) outputVal.put(i, new float[1][5]);
        float[][] inp_test = new float[1][8];
        float[][] enc_hidden = new float[1][1024];
        float[][] dec_input = new float[1][1];
        float[][] dec_test = new float[1][8];

        tfLite.runForMultipleInputsOutputs(new Object[] {inp_test,enc_hidden, dec_input, dec_test},outputVal);

And here are my gradle dependencies:

dependencies {
    implementation fileTree(dir: 'libs', include: ['*.jar'])

    implementation 'androidx.appcompat:appcompat:1.1.0'
    implementation 'org.tensorflow:tensorflow-lite:0.0.0-nightly'
    implementation 'org.tensorflow:tensorflow-lite-select-tf-ops:0.0.0-nightly'
    // This dependency adds the necessary TF op support.
    implementation 'androidx.constraintlayout:constraintlayout:1.1.3'
    testImplementation 'junit:junit:4.12'
    androidTestImplementation 'androidx.test.ext:junit:1.1.1'
    androidTestImplementation 'androidx.test.espresso:espresso-core:3.2.0'
}

As the error pointed, there was something wrong with dimensions at node 84. So I went ahead and visualised the tflite file using Netron. I have zoomed the concatenation node, you can find the pic of the node along with input and output dimensions here. You can find the whole generated graph here.

As it turns out, the concatenation node at location 84 isn't actually concatenating, you can see this from the input and output dimensions. It just spits out a 1X1X1 matrix after processing 1X1X1 and 1X1X256 matrix. I know tflite graph isn't same as the original model graph since a lot of operations are replaced and even removed for optimisations but this seems a little odd.

I can't relate this to the error. And if it runs prefectly on jupyter, is it a framework issue or am I missing something? Also, could anyone please explain me what does the error mean by t->dims->data[d] != t0->dims->data[d] what is d?

Please if you have answers to even any one of the question, please write it. If you require any extra details please let me know.

Here is the code for the model:


Tx = 8
def Partial_model():
    outputs = []
    X = tf.keras.layers.Input(shape=(Tx,))
    partial = tf.keras.layers.Input(shape=(Tx,))
    enc_hidden = tf.keras.layers.Input(shape=(units,))
    dec_input = tf.keras.layers.Input(shape=(1,))
    
    d_i = dec_input
    e_h = enc_hidden
    X_i = X
    
    enc_output, e_h = encoder(X, enc_hidden)
    
    
    dec_hidden = enc_hidden
    print(dec_input.shape, 'inp', dec_hidden.shape, 'dec_hidd')
    for t in range(1, Tx):
        print(t, 'tt')
      # passing enc_output to the decoder
        predictions, dec_hidden, _ = decoder(d_i, dec_hidden, enc_output)
#         outputs.append(predictions)
        print(predictions.shape, 'pred')
        d_i = tf.reshape(partial[:, t], (-1, 1))
        print(dec_input.shape, 'dec_input')
    
    predictions, dec_hidden, _ = decoder(d_i, dec_hidden, enc_output)
    d_i = tf.squeeze(d_i)
    
    outputs.append(tf.math.top_k(predictions, 5))
    
    return tf.keras.Model(inputs = [X, enc_hidden, dec_input, partial], outputs = [outputs[0][0], outputs[0][1]])




class Encoder():
  def __init__(self, vocab_size, embedding_dim, enc_units, batch_sz):
    self.batch_sz = batch_sz
    self.enc_units = enc_units
    self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
    self.gru = tf.keras.layers.GRU(self.enc_units,
                                   return_sequences=True,
                                   return_state=True,
                                   recurrent_initializer='glorot_uniform')

  def __call__(self, x, hidden):
    x = self.embedding(x)
    output, state = self.gru(x, initial_state = hidden)
    print(output.shape, hidden.shape, "out", "hid")
    return output, state


  def initialize_hidden_state(self):
    return tf.zeros((self.batch_sz, self.enc_units))



class BahdanauAttention():
  def __init__(self, units):
    self.W1 = tf.keras.layers.Dense(units)
    self.W2 = tf.keras.layers.Dense(units)
    self.V = tf.keras.layers.Dense(1)

  def __call__(self, query, values):
    # query hidden state shape == (batch_size, hidden size)
    # query_with_time_axis shape == (batch_size, 1, hidden size)
    # values shape == (batch_size, max_len, hidden size)
    # we are doing this to broadcast addition along the time axis to calculate the score
    print(query.shape, 'shape')
    query_with_time_axis = tf.expand_dims(query, 1)
    # score shape == (batch_size, max_length, 1)
    # we get 1 at the last axis because we are applying score to self.V
    # the shape of the tensor before applying self.V is (batch_size, max_length, units)
    print("2")
    score = self.V(tf.nn.tanh(
        self.W1(query_with_time_axis) + self.W2(values)))
    print("3")

    # attention_weights shape == (batch_size, max_length, 1)
    attention_weights = tf.nn.softmax(score, axis=1)

    # context_vector shape after sum == (batch_size, hidden_size)
    context_vector = attention_weights * values
    context_vector = tf.reduce_sum(context_vector, axis=1)
    
    return context_vector, attention_weights


class Decoder():
  def __init__(self, vocab_size, embedding_dim, dec_units, batch_sz):
    self.dec_units = dec_units
    self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
    self.gru = tf.keras.layers.GRU(self.dec_units,
                                   return_sequences=True,
                                   return_state=True,
                                   recurrent_initializer='glorot_uniform')
    self.fc = tf.keras.layers.Dense(vocab_size)

    # used for attention
    self.attention = BahdanauAttention(self.dec_units)

  def __call__(self, x, hidden, enc_output):
    # enc_output shape == (batch_size, max_length, hidden_size)
    context_vector, attention_weights = self.attention(hidden, enc_output)
    
    print(context_vector.shape, 'c_v', attention_weights.shape, "attention_w")

    # x shape after passing through embedding == (batch_size, 1, embedding_dim)
    x = self.embedding(x)

    # x shape after concatenation == (batch_size, 1, embedding_dim + hidden_size)
    print(x.shape, 'xshape', context_vector.shape, 'context')
    expanded_dims = tf.expand_dims(context_vector, 1)
    x = tf.concat([expanded_dims, x], axis=-1)

    # passing the concatenated vector to the GRU
    output, state = self.gru(x)

    # output shape == (batch_size * 1, hidden_size)
    output = tf.reshape(output, (-1, output.shape[2]))

    # output shape == (batch_size, vocab)
    x = self.fc(output)

    return x, state, attention_weights




like image 471
Anurag Shukla Avatar asked Jun 28 '20 19:06

Anurag Shukla


People also ask

How do you test a TFLite model in Colab?

You may use TensorFlow Lite Python interpreter to test your tflite model. It allows you to feed input data in python shell and read the output directly like you are just using a normal tensorflow model. I have answered this question here. And you can read this TensorFlow lite official guide for detailed information.


1 Answers

You can load the generated .tflite file inside python notebook and pass the same inputs as at Keras model. You have to see the exact outputs because during conversion of model there is no loss of accuracy. If there is a problem there...there will be problem during android operations. If not...everything will work fine. Use below code from Tensorflow guide to run inference in Python:

import numpy as np
import tensorflow as tf

# Load the TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path="converted_model.tflite")
interpreter.allocate_tensors()

# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Test the model on random input data.
input_shape = input_details[0]['shape']
input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
interpreter.set_tensor(input_details[0]['index'], input_data)

interpreter.invoke()

# The function `get_tensor()` returns a copy of the tensor data.
# Use `tensor()` in order to get a pointer to the tensor.
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)

Happy coding!

like image 100
Farmaker Avatar answered Oct 02 '22 14:10

Farmaker