These days I am trying to track down an error concerning the deployment of a TF model with TPU support.
I can get a model without TPU support running, but as soon as I enable quantization, I get lost.
I am in the following situation:
For the last point, I used the TFLiteConverter's Python API. The script that produces a functional tflite model is
import tensorflow as tf

graph_def_file = 'frozen_model.pb'
inputs = ['dense_input']
outputs = ['dense/BiasAdd']

converter = tf.lite.TFLiteConverter.from_frozen_graph(graph_def_file, inputs, outputs)
converter.inference_type = tf.lite.constants.FLOAT
input_arrays = converter.get_input_arrays()
converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]

tflite_model = converter.convert()
open('model.tflite', 'wb').write(tflite_model)
This tells me that my approach seems to be ok up to this point. Now, if I want to utilize the Coral TPU stick, I have to quantize my model (I took that into account during training). All I have to do is to modify my converter script. I figured that I have to change it to
import tensorflow as tf

graph_def_file = 'frozen_model.pb'
inputs = ['dense_input']
outputs = ['dense/BiasAdd']

converter = tf.lite.TFLiteConverter.from_frozen_graph(graph_def_file, inputs, outputs)
converter.inference_type = tf.lite.constants.QUANTIZED_UINT8  ## Indicates TPU compatibility
input_arrays = converter.get_input_arrays()
converter.quantized_input_stats = {input_arrays[0]: (0., 1.)}  ## mean, std_dev
converter.default_ranges_stats = (-128, 127)  ## min, max values for quantization (?)
converter.allow_custom_ops = True  ## not sure if this is needed
## REMOVED THE OPTIMIZATIONS ALTOGETHER TO MAKE IT WORK

tflite_model = converter.convert()
open('model.tflite', 'wb').write(tflite_model)
This tflite model produces results when loaded with the interpreter's Python API, but I am not able to understand their meaning. Also, there is no (or, if there is, it is well hidden) documentation on how to choose mean, std_dev and the min/max ranges. Finally, after compiling this with the edgetpu_compiler and deploying it (loading it with the C++ API), I receive an error:
INFO: Initialized TensorFlow Lite runtime.
ERROR: Failed to prepare for TPU. generic::failed_precondition: Custom op already assigned to a different TPU.
ERROR: Node number 0 (edgetpu-custom-op) failed to prepare.
Segmentation fault
I suppose I missed a flag or something during the conversion process. But as the documentation is also lacking here, I can't say for sure.
In short:
I am grateful for any help or guidance!
EDIT: I have opened a github issue with the full test code. Feel free to play around with this.
The TensorFlow Lite converter takes a TensorFlow model and generates a TensorFlow Lite model (an optimized FlatBuffer format identified by the .tflite file extension). You can load a SavedModel or directly convert a model you create in code. You can convert your model using the Python API or the command-line tool.
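For instance, a minimal conversion of a SavedModel with the TF 2.x Python API could look like this (the directory name is a placeholder):

import tensorflow as tf

saved_model_dir = 'saved_model/'  # placeholder: directory of an exported SavedModel

converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
tflite_model = converter.convert()

with open('model.tflite', 'wb') as f:
    f.write(tflite_model)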
Integer quantization is an optimization strategy that converts 32-bit floating-point numbers (such as weights and activation outputs) to the nearest 8-bit fixed-point numbers. This results in a smaller model and increased inferencing speed, which is valuable for low-power devices such as microcontrollers.
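Concretely, each quantized tensor carries a scale and a zero point, and the 8-bit values map back to real numbers through an affine transform. A small illustrative sketch with made-up parameters:

# TensorFlow Lite's quantization scheme:
#   real_value = (quantized_value - zero_point) * scale
scale, zero_point = 0.05, 128  # example parameters, not taken from a real model

def dequantize(q):
    return (q - zero_point) * scale

def quantize(r):
    return int(round(r / scale)) + zero_point

print(dequantize(200))  # 3.6
print(quantize(3.6))    # 200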
Post-training quantization includes general techniques to reduce CPU and hardware accelerator latency, processing, power, and model size with little degradation in model accuracy. These techniques can be performed on an already-trained float TensorFlow model and applied during TensorFlow Lite conversion.
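The simplest variant, dynamic-range quantization, only requires enabling the default optimizations on the converter. A minimal sketch, assuming a TF 2.x SavedModel at a placeholder path (note that the Edge TPU itself requires full integer quantization, covered further below):

import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model('saved_model/')  # placeholder path
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # quantizes the weights to 8 bits
tflite_model = converter.convert()

with open('model_dynamic_range.tflite', 'wb') as f:
    f.write(tflite_model)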
Quantization aware training emulates inference-time quantization, creating a model that downstream tools will use to produce actually quantized models. The quantized models use lower precision (e.g. 8-bit integers instead of 32-bit floats), which brings benefits during deployment.
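A rough sketch of quantization aware training with the tensorflow_model_optimization package (the Keras model below is just a placeholder):

import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Placeholder float model.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu', input_shape=(10,)),
    tf.keras.layers.Dense(1),
])

# Insert fake-quantization nodes that emulate 8-bit inference during training.
q_aware_model = tfmot.quantization.keras.quantize_model(model)
q_aware_model.compile(optimizer='adam', loss='mse')
# q_aware_model.fit(...)  # fine-tune, then convert with tf.lite.TFLiteConverter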
You should never need to manually set the quantization stats.
Have you tried the post-training-quantization tutorials?
https://www.tensorflow.org/lite/performance/post_training_integer_quant
Basically they set the quantization options:
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
Then they pass a "representative dataset" to the converter, so that the converter can run the model for a few batches to gather the necessary statistics:
def representative_data_gen():
    for input_value in mnist_ds.take(100):
        yield [input_value]

converter.representative_dataset = representative_data_gen
While there are options for quantized training, it's always easier to do post-training quantization.
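Putting those pieces together for a frozen graph like the one in the question might look roughly like this (the input shape and the random calibration data are placeholders, and with TF 2.x the converter lives under tf.compat.v1.lite):

import numpy as np
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_frozen_graph(
    'frozen_model.pb', ['dense_input'], ['dense/BiasAdd'])

# Calibration data: replace the random tensors with real samples from the
# training distribution. The (1, 10) shape is only a guess for dense_input.
def representative_data_gen():
    for _ in range(100):
        yield [np.random.rand(1, 10).astype(np.float32)]

converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

tflite_model = converter.convert()
with open('model_full_int.tflite', 'wb') as f:
    f.write(tflite_model)

A fully integer-quantized model of this kind is what the edgetpu_compiler expects as its input.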