Quantize a Keras neural network model

Tags: python, neural-network, tensorflow, keras, quantization

Recently, I've started creating neural networks with TensorFlow + Keras, and I would like to try the quantization feature available in TensorFlow. So far, experimenting with examples from the TF tutorials has worked fine, and I have this basic working example (from https://www.tensorflow.org/tutorials/keras/basic_classification):

import tensorflow as tf
from tensorflow import keras

fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

# fashion mnist data labels (indexes related to their respective labelling in the data set)
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# preprocess the train and test images
train_images = train_images / 255.0
test_images = test_images / 255.0

# settings variables
input_shape = (train_images.shape[1], train_images.shape[2])

# create the model layers
model = keras.Sequential([
    keras.layers.Flatten(input_shape=input_shape),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(10, activation=tf.nn.softmax)
])

# compile the model with added settings
model.compile(optimizer=tf.train.AdamOptimizer(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# train the model
epochs = 3
model.fit(train_images, train_labels, epochs=epochs)

# evaluate the accuracy of model on test data
test_loss, test_acc = model.evaluate(test_images, test_labels)
print('Test accuracy:', test_acc)

Now I would like to use quantization in the training and classification process. The quantization documentation (https://www.tensorflow.org/performance/quantization; the page has been unavailable since around September 15, 2018) suggests using this piece of code:

loss = tf.losses.get_total_loss()
tf.contrib.quantize.create_training_graph(quant_delay=2000000)
optimizer = tf.train.GradientDescentOptimizer(0.00001)
optimizer.minimize(loss)

However, it does not say where this code should go or how it connects to the rest of a TF program, let alone to a high-level model created with Keras. I have no idea how this quantization part relates to the neural network model created above. Simply inserting it after the neural network code produces the following error:

Traceback (most recent call last):
  File "so.py", line 41, in <module>
    loss = tf.losses.get_total_loss()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/losses/util.py", line 112, in get_total_loss
    return math_ops.add_n(losses, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/math_ops.py", line 2119, in add_n
    raise ValueError("inputs must be a list of at least one Tensor with the "
ValueError: inputs must be a list of at least one Tensor with the same dtype and shape

Is it possible to quantize a Keras NN model this way, or am I missing something basic? One possible solution that crossed my mind is to use the low-level TF API instead of Keras (which would mean quite a bit of work to construct the model), or perhaps to extract some of the lower-level methods from the Keras models.

asked Sep 10 '18 13:09

sikr_

2 Answers

As mentioned in other answers, TensorFlow Lite can help you with network quantization.

TensorFlow Lite provides several levels of support for quantization.

TensorFlow Lite post-training quantization quantizes the weights and activations of an already trained model with little effort. Quantization-aware training lets you train networks that can be quantized with minimal accuracy drop; it is only available for a subset of convolutional neural network architectures.
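
To make "quantizes weights" concrete, here is a minimal pure-Python sketch of an 8-bit affine (scale plus zero point) scheme of the kind commonly used for this. The function names and the sample weights are made up for illustration and are not TensorFlow API; TensorFlow's own kernels differ in detail. Quantizing and then immediately dequantizing, as done at the end, is also roughly what the simulated quantization in quantization-aware training computes in the forward pass.

```python
def quantize_uint8(values):
    """Map floats to integers in [0, 255] using a scale and a zero point."""
    # The represented range must contain 0.0 so that zero is stored exactly.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(-lo / scale)   # the integer that represents 0.0
    quantized = [min(255, max(0, round(v / scale) + zero_point)) for v in values]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Recover approximate floats from the 8-bit integers."""
    return [scale * (q - zero_point) for q in quantized]


# Hypothetical weights, not taken from the model in the question.
weights = [-0.82, -0.1, 0.0, 0.37, 1.25]
q, scale, zero_point = quantize_uint8(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("quantized:", q)
print("scale:", scale, "zero point:", zero_point)
print("largest round-trip error:", max_error)
```

The round-trip error of each value is bounded by half the scale, which is why a network whose weights span a narrow range loses little accuracy under this scheme.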

So first, you need to decide whether you need post-training quantization or quantization-aware training. For example, if you have already saved the model as *.h5 files, you would probably want to follow @Mitiku's instructions and do post-training quantization.

If you prefer to achieve higher performance by simulating the effect of quantization during training (using the method you quoted in the question), and your model is in the subset of CNN architectures supported by quantization-aware training, this example may help you with the interaction between Keras and TensorFlow. Basically, you only need to add this code between the model definition and the call to fit:

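# TF 1.x only: tf.contrib was removed in TensorFlow 2.x
sess = tf.keras.backend.get_session()
# rewrite the Keras graph in place, inserting fake-quantization ops
tf.contrib.quantize.create_training_graph(sess.graph)
# initialize all variables, including those added by the rewrite
sess.run(tf.global_variables_initializer())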

answered Oct 04 '22 23:10

Jianyu

As your network looks quite simple, you could perhaps use TensorFlow Lite.
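
As a rough sketch of what that could look like, assuming TensorFlow 2.x (where the converter lives under `tf.lite`; the 2018-era releases used different entry points), post-training quantization of an already trained Keras model takes only a few lines. The helper name `convert_to_quantized_tflite` and the output filename in the usage comment are hypothetical.

```python
def convert_to_quantized_tflite(keras_model):
    """Return the model as TFLite flatbuffer bytes with post-training quantization."""
    # Imported inside the function so this file can be loaded without TensorFlow.
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
    # Optimize.DEFAULT enables the converter's default post-training quantization.
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    return converter.convert()


# Usage, assuming `model` is the trained Sequential model from the question:
# tflite_bytes = convert_to_quantized_tflite(model)
# with open("fashion_mnist_quant.tflite", "wb") as f:  # hypothetical filename
#     f.write(tflite_bytes)
```

The result is a serialized model meant to be run with the TFLite interpreter rather than through `model.evaluate`, so accuracy has to be re-checked on the converted model.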

answered Oct 05 '22 01:10

Baptiste Pouthier
