Does anyone know if the TensorFlow compiled executables here include AVX support? I have been running that compiled version of TensorFlow on Google Compute Engine and it is slow. Dog slow. Cold molasses slow. LA traffic slow. This article says compiling with AVX support significantly improves performance on Google Compute Engine, but when I follow the compile process on that site it fails. Just wondering if AVX is already in the executables?
No, the default TensorFlow distributions are built without CPU extensions such as SSE4.1, SSE4.2, AVX, AVX2, FMA, etc., because these builds (e.g. the ones from pip install tensorflow) are intended to be compatible with as many CPUs as possible. Another argument is that even with these extensions a CPU is a lot slower than a GPU, and medium- and large-scale machine-learning training is expected to be performed on a GPU. See also a related discussion here.
The article is right: AVX and FMA instructions significantly (up to 300%!) speed up linear algebra computations, namely dot product, matrix multiplication, convolution, etc. If you want to use them, you'll have to compile TensorFlow from source, which is discussed in this question.
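Before going through a source build, it may be worth checking which of these extensions your VM's CPU actually exposes. The following is a minimal sketch, not an official TensorFlow tool: it assumes a Linux machine where /proc/cpuinfo is available, and the helper names (parse_cpu_flags, supported_extensions) are made up for illustration. The flag spellings (sse4_1, sse4_2, avx, avx2, fma) are the ones the Linux kernel reports.

```python
from pathlib import Path

# Extension names from the answer above, mapped to the flag spellings
# that Linux reports in /proc/cpuinfo.
EXTENSIONS = {
    "SSE4.1": "sse4_1",
    "SSE4.2": "sse4_2",
    "AVX": "avx",
    "AVX2": "avx2",
    "FMA": "fma",
}


def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line, or an empty set."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def supported_extensions(cpuinfo_text):
    """Map each extension name to True/False depending on whether its flag is present."""
    flags = parse_cpu_flags(cpuinfo_text)
    return {name: flag in flags for name, flag in EXTENSIONS.items()}


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")  # assumption: Linux only
    if cpuinfo.exists():
        for name, present in supported_extensions(cpuinfo.read_text()).items():
            print(f"{name}: {'yes' if present else 'no'}")
    else:
        print("/proc/cpuinfo not found; this sketch only works on Linux")
```

If an extension shows up as missing here, enabling it in a custom build would not help on that machine, so the check can save a failed or pointless compile.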