 

Is Tensorflow 1.12 compatible with CUDA 10.1?

I've successfully set up an Ubuntu 18.04 server with nvidia-smi 418.39, driver version 418.39, and CUDA 10.1.

I now have a user who wants to run TensorFlow but insists that it is not compatible with CUDA 10.1, only CUDA 10. There is no statement confirming this online anywhere that I can find, nor is it in any release patch notes from TF. Because setting this system up was kind of a pain to do, I'm a little hesitant to try downgrading just one version.

Can anyone verify whether TensorFlow 1.12 does or does not work with CUDA 10.1?

Eric Berry asked Feb 28 '19 19:02


3 Answers

I can also confirm that tf 1.13.1 does not work with CUDA 10.1. Importing tensorflow fails with the following error:

ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory

Running ldconfig -v shows the mismatch: TensorFlow looks for libcublas.so.10.0, but CUDA 10.1 installs libcublas.so.10.1.0.105.
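To make that comparison repeatable, here is a minimal Python sketch that pulls the libcublas names out of ldconfig output. The function name cublas_versions is my own, and it assumes ldconfig -p (which prints the linker cache) is available on the machine; it only reports what the cache lists and says nothing about whether TensorFlow will actually load.

```python
import re
import subprocess


def cublas_versions(ldconfig_output):
    """Return the sorted, de-duplicated libcublas names found in ldconfig output."""
    # Matches names such as libcublas.so.10.0 or libcublas.so.10.1.0.105
    pattern = re.compile(r"\blibcublas\.so(?:\.\d+)*")
    return sorted(set(pattern.findall(ldconfig_output)))


if __name__ == "__main__":
    try:
        # Assumption: ldconfig is on PATH; on some systems it lives in /sbin.
        result = subprocess.run(
            ["ldconfig", "-p"], capture_output=True, text=True, check=False
        )
        output = result.stdout
    except OSError:
        output = ""
    print(cublas_versions(output))
```

If libcublas.so.10.0 is missing from the printed list, the ImportError above is the expected result.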

fisakhan answered Sep 18 '22 06:09


I can confirm that even tf 1.13.1 works only with CUDA 10.0 for me, not 10.1. I don't know whether a symlink would work, though. If you try to run tf 1.13.1 on CUDA 10.1, it fails with "ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory".
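A quick way to check for this without installing TensorFlow is to ask the dynamic loader for the same library name that appears in the error. This is only a sketch: can_load is a hypothetical helper name, and a successful load does not prove TensorFlow itself will work, only that this one library can be found.

```python
import ctypes


def can_load(soname):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the library is missing or cannot be opened.
        return False
    return True


if __name__ == "__main__":
    # Library name taken from the ImportError message above.
    name = "libcublas.so.10.0"
    print(name, "loadable" if can_load(name) else "not found")
```

The same check can be rerun after any change to the library path to see whether the loader now finds the file.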

user2230285 answered Oct 21 '22 03:10


I could not build the later versions 1.13.1 and 2.0.0-alpha0 against CUDA 10.1, so TensorFlow 1.12 can reasonably be considered incompatible with it as well.

I have tried building TensorFlow from source with GPU support. The TensorFlow versions I considered were 1.13.1 and 2.0.0-alpha0. The machine I used runs CentOS 7.6 with GCC 4.8.5. I have the NVIDIA Driver version 418.67 installed (which has the release date 2019.5.7 and supports CUDA Toolkit 10.1).

I succeeded in building both TensorFlow versions with CUDA 10.0 and cuDNN 7.6.0 + NCCL 2.4.7 (for CUDA 10.0). Note that you don't need to have the GPU attached to the machine (especially if you're using a VM in the cloud) while you're building TensorFlow with GPU support.

However, when I switched to CUDA 10.1 and cuDNN 7.6.0 + NCCL 2.4.7 (for CUDA 10.1), neither TensorFlow version could be built. Besides the change in the location of libcublas, another source of the error is that no libcudart.so* files are found in cuda-10.1/lib64/ (while they do exist in cuda-10.0/lib64/).
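To compare the two toolkit directories yourself, something like the following Python sketch lists the libcudart.so* files in each directory you pass on the command line. find_libs is a hypothetical helper, and the directory paths are whatever your install uses; I am not assuming a particular prefix here.

```python
import glob
import os
import sys


def find_libs(lib_dir, stem="libcudart.so"):
    """Return sorted file names in lib_dir that start with the given stem."""
    matches = glob.glob(os.path.join(lib_dir, stem + "*"))
    return sorted(os.path.basename(path) for path in matches)


if __name__ == "__main__":
    # Pass the lib64 directories of the toolkits to compare as arguments.
    for lib_dir in sys.argv[1:]:
        found = find_libs(lib_dir)
        print(lib_dir, found if found else "no libcudart.so* found")
```

An empty result for the CUDA 10.1 directory matches what I saw during the failed builds.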

TDT answered Oct 21 '22 02:10