Fastest way to run recurrent neural network (inference) on mobile device

Tags:

What I have: A trained recurrent neural network in Tensorflow.

What I want: A mobile application that can run this network as fast as possible (inference mode only, no training).

I believe there are multiple ways how I can accomplish my goal, but I would like you feedback/corrections and additions because I have never done this before.

Tensorflow Lite. Pro: Straight forward, available on Android and iOS. Contra: Probably not the fastest method, right?
TensorRT. Pro: Very fast + I can write custom C code to make it faster. Contra: Used for Nvidia devices so no easy way to run on Android and iOS, right?
Custom Code + Libraries like openBLAS. Pro: Probably very fast and possibility to link to it on Android on iOS (if I am not mistaken). Contra: Is there much use for recurrent neural networks? Does it really work well on Android + iOS?
Re-implement Everything. I could also rewrite the whole computation in C/C++ which shouldn't be too hard with recurrent neural networks. Pro: Probably the fastest method because I can optimize everything. Contra: Will take a long time and if the network changes I have to update my code as well (although I am willing to do it this way if it really is the fastest). Also, how fast can I make calls to libraries (C/C++) on Android? Am I limited by the Java interfaces?

Some details about the mobile application. The application will take a sound recording of the user, do some processing (like Speech2Text) and output the text. I do not want to find a solution that is "fast enough", but the fastest option because this will happen over very large sound files. So almost every speed improvement counts. Do you have any advice, how I should approach this problem?

Last question: If I try to hire somebody to help me out, should I look for an Android/iOS-, Embedded- or Tensorflow- type of person?

589

asked Mar 09 '18 12:03

user667804

1 Answers

1. TensorflowLite

Pro: it uses GPU optimizations on Android; fairly easy to incorporate into Swift/Objective-C app, and very easy into Java/Android (just adding one line in gradle.build); You can transform TF model to CoreML

Cons: if you use C++ library - you will have some issues adding TFLite as a library to your Android/Java-JNI (there is no native way to build such library without JNI); No GPU support on iOS (community works on MPS integration tho)

Also here is reference to TFLite speech-to-text demo app, it could be useful.

2. TensorRT

It uses TensorRT uses cuDNN which uses CUDA library. There is CUDA for Android, not sure if it supports the whole functionality.

3. Custom code + Libraries

I would recommend you to use Android NNet library and CoreML; in case you need to go deeper - you can use Eigen library for linear algebra. However, writing your own custom code is not beneficial in the long term, you would need to support/test/improve it - which is a huge deal, more important than performance.

Re-implement Everything

This option is very similar to the previous one, implementing your own RNN(LSTM) should be fine, as soon as you know what you are doing, just use one of the linear algebra libraries (e.g. Eigen).

The overall recommendation would be to:**

try to do it server side: use some lossy compression and serverside speech2text;
try using Tensorflow Lite; measure performance, find bottlenecks, try to optimize
if some parts of TFLite would be too slow - reimplement them in custom operations; (and make PR to the Tensorflow)
if bottlenecks are on the hardware level - goto 1st suggestion

136

answered Oct 01 '22 17:10

Stanislav Levental

Related questions
                            
                                How large are the steps on onAdjustVolume of the VolumeProvider?
                            
                                Webview inside RecyclerView is showing blank screen sometimes on Nougat devices only
                            
                                Chromecast receiver application cannot play widevine drm protected content from Android sender application
                            
                                Fetch accurate intermediate Location points in background and calculate Distance of Starting and ending locations in Android
                            
                                How to receive Facebook Deep Link data
                            
                                Dokka - skip generating javadoc for default android packages
                            
                                Could not launch intent within 45 seconds. Perhaps the main thread has not gone idle within a reasonable amount of time?
                            
                                Could not find android.jar for API Level 26 in Visual Studio 2017
                            
                                Render website as texture in Unity3D on Android
                            
                                Refresh contentComponent in react-navigation
                            
                                Android app with realTime database without using FIREBASE maybe Something like hooking on server [closed]
                            
                                GridView with colspan and rowSpan
                            
                                New Google Cloud Module not present in Android Studio 3.0
                            
                                Unable to validate issuer when trying to access API
                            
                                Reset timer on firebase Job dispatcher
                            
                                ADWORDS Universal App campaign: 'new playstore app not shown in app list'
                            
                                BottomNavigationView text blinking on change
                            
                                Need to forget configured Wifi network programmatically in Android 6.0
                            
                                Updating aar file does not reflect changes
                            
                                Start a background service on boot in Oreo

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Fastest way to run recurrent neural network (inference) on mobile device

Tags:

performance

android

ios

tensorflow-lite

tensorrt

user667804

People also ask

1 Answers

Stanislav Levental

Recent Activity

Donate For Us