 

TensorFlow quantization

I would like to optimize a graph using TensorFlow's transform_graph tool. I tried optimizing the graph from MultiNet (and others with similar encoder-decoder architectures). However, the transformed graph is actually slower with quantize_weights, and much slower still with quantize_nodes. TensorFlow's documentation does warn that quantization may yield no improvement and can even be slower. Is this normal for the graph, software, and hardware described below? A sketch of my conversion step follows.
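For reference, here is a minimal sketch of the conversion step, assuming a frozen graph at multinet_frozen.pb with an input node named "input" and an output node named "output" (the file and node names are placeholders for illustration, not the actual MultiNet names). The Python wrapper used here ships with the TensorFlow source tree and may be missing from older pip builds; the bazel-built transform_graph binary applies the same transforms from the command line.

    import tensorflow as tf
    from tensorflow.tools.graph_transforms import TransformGraph

    # Load the frozen graph to be transformed.
    with tf.gfile.GFile('multinet_frozen.pb', 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())

    # Apply the weight quantization transform; appending 'quantize_nodes'
    # additionally rewrites the ops themselves to quantized kernels.
    transformed = TransformGraph(
        graph_def,
        inputs=['input'],      # hypothetical input node name
        outputs=['output'],    # hypothetical output node name
        transforms=['quantize_weights'])

    # Write the transformed graph back out for inference.
    with tf.gfile.GFile('multinet_quantized.pb', 'wb') as f:
        f.write(transformed.SerializeToString())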

Here is my system information for your reference:

  • OS Platform and Distribution: Linux Ubuntu 16.04
  • TensorFlow installed from: built from source (CPU) for graph conversion; binary Python package (GPU) for inference
  • TensorFlow version: both using r1.3
  • Python version: 2.7
  • Bazel version: 0.6.1
  • CUDA/cuDNN version: 8.0/6.0 (inference only)
  • GPU model and memory: GeForce GTX 1080 Ti

I can post the full scripts used to reproduce this if necessary.

asked Oct 10 '17 by YannickB

1 Answer

It seems that quantization in TensorFlow currently runs only on CPUs: the quantized kernels have no GPU implementations, so quantized ops fall back to the CPU. See: https://github.com/tensorflow/tensorflow/issues/2807
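One way to confirm this on your own graph is to enable device-placement logging and watch where the quantized ops land; quantized kernels such as QuantizedConv2D should appear pinned to /cpu:0 even in a GPU session. The file and node names below carry over the placeholder names from the sketch in the question, and the input shape is an assumption:

    import numpy as np
    import tensorflow as tf

    # Load the quantized graph produced by transform_graph.
    with tf.gfile.GFile('multinet_quantized.pb', 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())

    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name='')

    # log_device_placement prints the device each op is assigned to,
    # which reveals quantized ops falling back to /cpu:0.
    config = tf.ConfigProto(log_device_placement=True)
    with tf.Session(graph=graph, config=config) as sess:
        x = graph.get_tensor_by_name('input:0')
        y = graph.get_tensor_by_name('output:0')
        sess.run(y, feed_dict={x: np.zeros((1, 384, 1248, 3), np.float32)})

In a GPU session this fallback also forces host-device copies around every quantized op, and quantize_nodes additionally inserts quantize/dequantize pairs, which together would explain why the transformed graph runs slower rather than faster.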

answered Nov 15 '22 by Benjamin Tan Wei Hao