What are the TensorFlow qint8, quint8, qint32, qint16, and quint16 datatypes?

I'm looking at the TensorFlow tf.nn.quantized_conv2d function and I'm wondering what exactly the qint8, etc. datatypes are, particularly whether they are the datatypes used for the "fake quantization nodes" in tf.contrib.quantize or are actually stored using 8 bits (for qint8) in memory.

I know that they are defined in tf.dtypes.DType, but that doesn't have any information about what they actually are.

albertNod asked Jul 30 '19 20:07


1 Answer

These are the data types of the output tensor of tf.quantization.quantize(); they correspond to that function's argument T.

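The pseudocode below, from the MIN_COMBINED mode of that function, shows how a tensor is converted (quantized) from one data type (e.g. float32) to another (tf.qint8, tf.quint8, tf.qint32, tf.qint16, tf.quint16). Here range(T) is numeric_limits<T>::max() - numeric_limits<T>::min(), i.e. 255 for the 8-bit types.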

out[i] = (in[i] - min_range) * range(T) / (max_range - min_range)
if T == qint8: out[i] -= (range(T) + 1) / 2.0
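
To make the formula concrete, here is a minimal NumPy sketch of it for the two 8-bit targets. It is not TensorFlow's implementation: the helper name quantize_min_combined is hypothetical, and the clipping to [min_range, max_range] and the final rounding step are assumptions added so the result fits the integer type.

```python
import numpy as np

def quantize_min_combined(values, min_range, max_range, signed=True):
    """Apply the formula above for an 8-bit target (hypothetical helper, not a TensorFlow API)."""
    dtype = np.int8 if signed else np.uint8   # stand-ins for qint8 / quint8
    info = np.iinfo(dtype)
    type_range = float(info.max) - float(info.min)  # range(T): 255 for 8-bit types

    # Assumption: inputs outside [min_range, max_range] are clipped first.
    x = np.clip(np.asarray(values, dtype=np.float32), min_range, max_range)

    # out[i] = (in[i] - min_range) * range(T) / (max_range - min_range)
    out = (x - min_range) * type_range / (max_range - min_range)

    # if T == qint8: out[i] -= (range(T) + 1) / 2.0   (shifts [0, 255] to [-128, 127])
    if signed:
        out -= (type_range + 1) / 2.0

    # Assumption: round to the nearest integer before the narrowing cast.
    return np.round(out).astype(dtype)

q = quantize_min_combined([-1.0, 1.0], min_range=-1.0, max_range=1.0)
print(q.tolist())   # [-128, 127]
print(q.itemsize)   # 1 (one byte per element in this NumPy stand-in)
```

The endpoints of the float range map to the endpoints of the integer type, and each quantized element occupies a single byte in the NumPy array used here.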

The resulting tensors can then be passed to functions such as tf.nn.quantized_conv2d, which take a quantized tensor as input.

TL;DR: to answer your question in short, they are actually stored using 8 bits (for qint8) in memory.

If you feel this answer is useful, kindly accept this answer and/or up vote it. Thanks.

Tensorflow Support answered Oct 03 '22 14:10