I would like to optimize a graph using Tensorflow's transform_graph tool. I tried optimizing the graph from MultiNet (and others with similar encoder-decoder architectures). However, the optimized graph is actually slower when using quantize_weights, and even much slower when using quantize_nodes. From Tensorflow's documentation, there may be no improvements, or it may even be slower, when quantizing. Any idea if this is normal with the graph/software/hardware below?
Here is my system information for your reference:
I can post all the scripts used to reproduce if necessary.
It seems like quantization in Tensorflow only happens on CPUs. See: https://github.com/tensorflow/tensorflow/issues/2807
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With