 

React Native TensorFlow.js very slow prediction

So I made and trained my own model in Keras. Here's the model (~6 million params):

[image: Keras model summary, ~6 million parameters]

I converted it into a tfjs graph model and implemented it in my React Native app. Here's the conversion command:

tensorflowjs_converter --input_format=tf_saved_model --output_format=tfjs_graph_model --weight_shard_size_bytes 60000000 /Some/Path /Some/Other/Path

My problem is that it takes a whole 2 minutes for a single prediction, while for my app's purposes I need real-time speed. On Colab the same prediction takes 400 ms. Here's the prediction code:

let input = tf.randomNormal([50, 6]);
input = tf.reshape(input, [-1, 50, 6]); // add a batch dimension: [1, 50, 6]
const res = model.predict(input);
// const res = await this._model.executeAsync(input); // needed if the graph contains control-flow ops
console.log(await res.data()); // data() returns a Promise, so await it
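
A minimal timing sketch (my own addition, assuming model is the tf.GraphModel loaded above and that this runs inside an async function, like the snippet): the first tfjs call pays one-time warm-up costs such as shader compilation, so a later call gives the steady-state latency.

// Warm up: the first prediction includes one-time backend initialization costs.
const warmup = model.predict(tf.zeros([1, 50, 6]));
await warmup.data();
warmup.dispose();

// Time a steady-state prediction.
const t0 = Date.now();
const out = model.predict(tf.randomNormal([1, 50, 6]));
await out.data(); // forces execution to finish before stopping the clock
console.log('inference took', Date.now() - t0, 'ms');
out.dispose();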

So is there something I'm doing wrong with tfjs or the conversion?

Is my model just too big? And even if I make my model smaller, are there any optimisation processes I can use here?

And for this kind of model what 'size' would allow me to have a quick prediction time?

Update: so the number of parameters in my model was absurd.

After a thorough remodel, I've made a 4k-param model (still LSTM-based) and converted it into a tfjs graph model. The model.json file weighs 40.5 kB and the shard file weighs 18.5 kB. The model loads quickly, but prediction time is still a problem: for such a small model, my Android emulator still takes 1 second per prediction. So my question remains: is there any way to make this execution faster?
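
One thing worth checking (an assumption on my part, since the post doesn't say which backend is active): whether tfjs is actually running on the rn-webgl backend. On an Android emulator the GPU path is often unavailable or software-rendered, which by itself could explain a 1-second inference even on a tiny model. A minimal check, assuming the @tensorflow/tfjs-react-native package is installed:

import * as tf from '@tensorflow/tfjs';
import '@tensorflow/tfjs-react-native'; // registers the rn-webgl backend

await tf.ready(); // waits for backend initialization to complete
console.log('active backend:', tf.getBackend()); // expect 'rn-webgl' on a real device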

asked May 06 '26 by wakkko

1 Answer

Your model is definitely too large.

Smartphone RAM and GPU memory are much smaller than whatever Google offers on Colab. Normally the whole network fits in RAM, so weight fetches and math are fast. When the model is too large to fit, data has to be shuttled between phone storage and memory every time a matrix multiplication runs, which is essentially all the time, and those loads are very slow.

To speed it up, make it smaller. There should be a threshold below which the model fits properly in memory and you get a big jump in speed. I warn you though: if you want it to work on all kinds of devices, you'll need to make it really small, around 100k params.
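
If shrinking the architecture alone isn't enough, one extra step worth trying (my suggestion, not something from the question) is weight quantization at conversion time; recent versions of tensorflowjs_converter accept a --quantize_float16 flag that roughly halves the weight payload, usually with negligible accuracy loss:

tensorflowjs_converter --input_format=tf_saved_model --output_format=tfjs_graph_model --quantize_float16 /Some/Path /Some/Other/Path

This mainly cuts download size and memory footprint, which is exactly what helps the model fit in RAM on small devices.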

answered May 09 '26 by Andy K


