I have installed Cuda 10.1 and cudnn on Ubuntu 18.04, and it seems to be installed properly: when I run nvcc -V and nvidia-smi, I get the expected output:
user:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Fri_Feb__8_19:08:17_PST_2019
Cuda compilation tools, release 10.1, V10.1.105
user:~$ nvidia-smi
Mon Mar 18 14:36:47 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.43 Driver Version: 418.43 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Quadro K5200 Off | 00000000:03:00.0 On | Off |
| 26% 39C P8 14W / 150W | 225MiB / 8118MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1538 G /usr/lib/xorg/Xorg 32MiB |
| 0 1583 G /usr/bin/gnome-shell 5MiB |
| 0 3008 G /usr/lib/xorg/Xorg 100MiB |
| 0 3120 G /usr/bin/gnome-shell 82MiB |
+-----------------------------------------------------------------------------+
I have installed tensorflow using:
user:~$ sudo pip3 install --upgrade tensorflow-gpu
The directory '/home/amin/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/amin/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Requirement already up-to-date: tensorflow-gpu in /usr/local/lib/python3.6/dist-packages (1.13.1)
Requirement already satisfied, skipping upgrade: keras-applications>=1.0.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu) (1.0.7)
Requirement already satisfied, skipping upgrade: protobuf>=3.6.1 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu) (3.6.1)
Requirement already satisfied, skipping upgrade: wheel>=0.26 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu) (0.32.3)
Requirement already satisfied, skipping upgrade: absl-py>=0.1.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu) (0.7.0)
Requirement already satisfied, skipping upgrade: keras-preprocessing>=1.0.5 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu) (1.0.9)
Requirement already satisfied, skipping upgrade: gast>=0.2.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu) (0.2.2)
Requirement already satisfied, skipping upgrade: termcolor>=1.1.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu) (1.1.0)
Requirement already satisfied, skipping upgrade: grpcio>=1.8.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu) (1.18.0)
Requirement already satisfied, skipping upgrade: tensorflow-estimator<1.14.0rc0,>=1.13.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu) (1.13.0)
Requirement already satisfied, skipping upgrade: six>=1.10.0 in /usr/lib/python3/dist-packages (from tensorflow-gpu) (1.11.0)
Requirement already satisfied, skipping upgrade: numpy>=1.13.3 in /usr/lib/python3/dist-packages (from tensorflow-gpu) (1.13.3)
Requirement already satisfied, skipping upgrade: astor>=0.6.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu) (0.7.1)
Requirement already satisfied, skipping upgrade: tensorboard<1.14.0,>=1.13.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu) (1.13.1)
Requirement already satisfied, skipping upgrade: h5py in /usr/local/lib/python3.6/dist-packages (from keras-applications>=1.0.6->tensorflow-gpu) (2.9.0)
Requirement already satisfied, skipping upgrade: setuptools in /usr/local/lib/python3.6/dist-packages (from protobuf>=3.6.1->tensorflow-gpu) (40.6.3)
Requirement already satisfied, skipping upgrade: mock>=2.0.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow-estimator<1.14.0rc0,>=1.13.0->tensorflow-gpu) (2.0.0)
Requirement already satisfied, skipping upgrade: werkzeug>=0.11.15 in /usr/local/lib/python3.6/dist-packages (from tensorboard<1.14.0,>=1.13.0->tensorflow-gpu) (0.14.1)
Requirement already satisfied, skipping upgrade: markdown>=2.6.8 in /usr/local/lib/python3.6/dist-packages (from tensorboard<1.14.0,>=1.13.0->tensorflow-gpu) (3.0.1)
Requirement already satisfied, skipping upgrade: pbr>=0.11 in /usr/local/lib/python3.6/dist-packages (from mock>=2.0.0->tensorflow-estimator<1.14.0rc0,>=1.13.0->tensorflow-gpu) (5.1.1)
However, when I try to import tensorflow, I get an error about libcublas.so.10.0:
user:~$ python3
Python 3.6.7 (default, Oct 22 2018, 11:32:17)
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "/usr/lib/python3.6/imp.py", line 243, in load_module
return load_dynamic(name, filename, file)
File "/usr/lib/python3.6/imp.py", line 343, in load_dynamic
return _load(spec)
ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.6/dist-packages/tensorflow/__init__.py", line 24, in <module>
from tensorflow.python import pywrap_tensorflow # pylint: disable=unused-import
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/__init__.py", line 49, in <module>
from tensorflow.python import pywrap_tensorflow
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 74, in <module>
raise ImportError(msg)
ImportError: Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "/usr/lib/python3.6/imp.py", line 243, in load_module
return load_dynamic(name, filename, file)
File "/usr/lib/python3.6/imp.py", line 343, in load_dynamic
return _load(spec)
ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory
Failed to load the native TensorFlow runtime.
See https://www.tensorflow.org/install/errors
for some common reasons and solutions. Include the entire stack trace
above this error message when asking for help.
What am I missing, and how can I resolve this?
Thanks
I downloaded CUDA 10.0 from NVIDIA's download page for that release, then installed it using the following commands:
sudo dpkg -i cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda-10-0
I then installed cuDNN v7.5.0 for CUDA 10.0 from the cuDNN download page (you need to log in with an account). After choosing the correct version, I downloaded it via the "CUDNN power" link, then copied the cuDNN include and lib files as follows:
sudo cp -P cuda/targets/ppc64le-linux/include/cudnn.h /usr/local/cuda-10.0/include/
sudo cp -P cuda/targets/ppc64le-linux/lib/libcudnn* /usr/local/cuda-10.0/lib64/
sudo chmod a+r /usr/local/cuda-10.0/lib64/libcudnn*
After that, I modified .bashrc to set the library and binary paths for CUDA 10.0. If you do not have these lines already, you need to add them to .bashrc:
export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
And after all these steps, I managed to import tensorflow in python3 successfully.
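To confirm that the dynamic loader can now see the library TensorFlow asked for, one option is to try opening it directly from Python. This is only a minimal sketch and not part of the steps above: the helper name can_load is made up for illustration, and it uses nothing beyond the standard ctypes module. Run it from a fresh shell so the .bashrc changes are in effect.

```python
import ctypes


def can_load(lib_name):
    """Return True if the dynamic loader can open lib_name, else False."""
    try:
        ctypes.CDLL(lib_name)
    except OSError:
        # Raised when the file is not found on the loader's search path.
        return False
    return True


if __name__ == "__main__":
    # libcublas.so.10.0 is the file named in the ImportError;
    # libcudnn.so.7 is the cuDNN 7.x library copied in the steps above.
    for name in ("libcublas.so.10.0", "libcudnn.so.7"):
        print(name, "loadable" if can_load(name) else "NOT found")
```

If libcublas.so.10.0 is reported as not found, LD_LIBRARY_PATH most likely does not yet include /usr/local/cuda-10.0/lib64 in the current shell.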
If using Cuda 10.1 (as directed in https://www.tensorflow.org/install/gpu), the problem is that libcublas.so.10 was moved out of the cuda-10.1 directory and into cuda-10.2(!)
Copying from this answer: https://github.com/tensorflow/tensorflow/issues/26182#issuecomment-684993950
... libcublas.so.10 sits in /usr/local/cuda-10.2/lib64 (surprise from nvidia - installation of 10.1 installs some 10.2 stuff) but only /usr/local/cuda is in include path which points to /usr/local/cuda-10.1.
The fix is to add it to your library search path (LD_LIBRARY_PATH):
export LD_LIBRARY_PATH=/usr/local/cuda-10.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
Note: This fix is known to work in Cuda 10.1, V10.1.243 (print your version with nvcc -V).
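Before editing LD_LIBRARY_PATH, it can help to check where libcublas actually landed on your machine. The following is a rough sketch under the assumption that the toolkits live under /usr/local/cuda*, as in the paths quoted above; the function name find_libs and its parameters are hypothetical.

```python
import glob
import os


def find_libs(pattern="libcublas.so*", root="/usr/local"):
    """List files matching pattern in the lib dirs of every root/cuda* install."""
    hits = []
    # Check both the lib64 directory and the targets/<arch>/lib layout.
    for sub in ("lib64", os.path.join("targets", "*", "lib")):
        hits.extend(glob.glob(os.path.join(root, "cuda*", sub, pattern)))
    return sorted(set(hits))


if __name__ == "__main__":
    for path in find_libs():
        print(path)
```

Whichever directory the output shows (for example /usr/local/cuda-10.2/lib64) is the one to put on LD_LIBRARY_PATH. The same file may be listed more than once when /usr/local/cuda or lib64 is a symlink.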
CUDA 10.1 (installed as per tensorflow docs) throws can't find libcublas.so.10.0 errors. The libs exist in /usr/local/cuda-10.1/targets/x86_64-linux/lib/ but are misnamed.
There was another (lost) stackoverflow post saying this was a pinned dependency issue with the package and could be fixed with an extra cli flag to apt. This didn't seem to fix the issue for me.
A tested workaround is to modify the instructions to downgrade to CUDA 10.0:
# Uninstall packages from tensorflow installation instructions
sudo apt-get remove cuda-10-1 \
libcudnn7 \
libcudnn7-dev \
libnvinfer6 \
libnvinfer-dev \
libnvinfer-plugin6
# WORKS: Downgrade to CUDA-10.0
sudo apt-get install -y --no-install-recommends \
cuda-10-0 \
libcudnn7=7.6.4.38-1+cuda10.0 \
libcudnn7-dev=7.6.4.38-1+cuda10.0;
sudo apt-get install -y --no-install-recommends \
libnvinfer6=6.0.1-1+cuda10.0 \
libnvinfer-dev=6.0.1-1+cuda10.0 \
libnvinfer-plugin6=6.0.1-1+cuda10.0;
Upgrading to CUDA-10.2 also seems to suffer from the same problem
# BROKEN: Upgrade to CUDA-10.2
# use `apt show -a libcudnn7 libnvinfer7` to find 10.2 compatible version numbers
sudo apt-get install -y --no-install-recommends \
cuda-10-2 \
libcudnn7=7.6.5.32-1+cuda10.2 \
libcudnn7-dev=7.6.5.32-1+cuda10.2;
sudo apt-get install -y --no-install-recommends \
libnvinfer7=7.0.0-1+cuda10.2 \
libnvinfer-dev=7.0.0-1+cuda10.2 \
libnvinfer-plugin7=7.0.0-1+cuda10.2;
Test GPU Visibility in Python
python3
>>> import tensorflow as tf
>>> tf.test.is_gpu_available()
FutureWarnings on tensorflow import
https://github.com/tensorflow/tensorflow/issues/30427
two solutions:
pip3 install tf-nightly-gpu
pip3 install "numpy<1.17"
Update:
You also need the correct tensorflow version to match with your CUDA version
Tensorflow / CUDA version combinations:
For the full list, see: https://www.tensorflow.org/install/source#tested_build_configurations
You may potentially need to reinstall tensorflow with a named version matching your CUDA:
pip uninstall tensorflow tensorflow-gpu
pip install tensorflow==2.1.0 tensorflow-gpu==2.1.0
Then add CUDA to $PATH and $LD_LIBRARY_PATH in ~/.bashrc
~/.bashrc
# CUDA Environment Setup: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#environment-setup
for CUDA_BIN_DIR in `find /usr/local/cuda-*/bin -maxdepth 0`; do export PATH="$PATH:$CUDA_BIN_DIR"; done;
for CUDA_LIB_DIR in `find /usr/local/cuda-*/lib64 -maxdepth 0`; do export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}$CUDA_LIB_DIR"; done;
export PATH=`echo $PATH | tr ':' '\n' | awk '!x[$0]++' | tr '\n' ':' | sed 's/:$//g'` # Deduplicate $PATH
export LD_LIBRARY_PATH=`echo $LD_LIBRARY_PATH | tr ':' '\n' | awk '!x[$0]++' | tr '\n' ':' | sed 's/:$//g'` # Deduplicate $LD_LIBRARY_PATH
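The two deduplication lines keep the first occurrence of each entry and drop later repeats. For readers who find the tr/awk/sed pipeline hard to follow, here is the same idea as a short Python sketch. The function name dedupe_path is made up for illustration, and unlike the shell version it also drops empty entries.

```python
def dedupe_path(value, sep=":"):
    """Drop repeated entries from a PATH-style string, keeping first occurrences."""
    seen = set()
    kept = []
    for entry in value.split(sep):
        # Skip empty entries (e.g. from "::" or a trailing ":") and repeats.
        if entry and entry not in seen:
            seen.add(entry)
            kept.append(entry)
    return sep.join(kept)


if __name__ == "__main__":
    import os
    print(dedupe_path(os.environ.get("LD_LIBRARY_PATH", "")))
```

Order matters here because the loader searches directories left to right, which is why the first occurrence is the one that is kept.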
This error occurs when the installed versions of cuda and tensorflow are not compatible. I encountered a similar ImportError while running tensorflow version 1.13.0 with cuda 9. Since I had installed tensorflow in a virtual environment with pip, I just uninstalled tensorflow 1.13.0 and installed tensorflow 1.12.0 as follows:
pip uninstall tensorflow-gpu tensorflow-estimator tensorboard
pip install tensorflow-gpu==1.12.0
Everything now works.
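The file name in the ImportError is a useful hint about which CUDA release the installed tensorflow build was linked against (libcublas.so.10.0 points at CUDA 10.0, for instance). As a small sketch, assuming the message has the same "cannot open shared object file" wording as the traceback in the question, the name and version suffix can be pulled out like this; missing_library is a hypothetical helper, not a tensorflow API.

```python
import re

# Matches e.g. "libcublas.so.10.0: cannot open shared object file"
_MISSING_LIB = re.compile(
    r"(lib[\w+-]+\.so(?:\.\d+)*): cannot open shared object file"
)


def missing_library(message):
    """Return (file_name, version_suffix) for the first missing .so, or None."""
    match = _MISSING_LIB.search(message)
    if match is None:
        return None
    name = match.group(1)
    # Everything after ".so" is the version suffix, e.g. "10.0".
    version = name.split(".so", 1)[1].lstrip(".")
    return name, version


if __name__ == "__main__":
    try:
        import tensorflow  # noqa: F401
    except ImportError as exc:
        print(missing_library(str(exc)))
```

Comparing that suffix with the release reported by nvcc -V shows quickly whether the two are out of step.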
As CalderBot mentioned, you can also do this:
sudo cp -r /usr/local/cuda-10.2/lib64/libcu* /usr/local/cuda-10.1/lib64/
I had the correct version of CUDA and tensorflow-gpu==1.14.0 installed in my conda environment, but somehow I was still getting this error message. This post helped me to finally solve it.
I had previously installed tensorflow-gpu via pip. Creating a new environment and installing tensorflow-gpu via conda solved my problem.
conda install -c anaconda tensorflow-gpu=1.14.0
I had the same issue. I fixed it by adding the below command to the '.bashrc' file.
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-10.0/lib64/
System configuration:
Ubuntu 16.04 LTS
Tensorflow GPU 2.0beta1
Cuda 10.0
cuDNN 7.6.0 for Cuda 10.0
I used conda to configure my system.
The problem is caused by your current cuda version, which is 10.1 (as we can see in the top right corner of your nvidia-smi output).
According to the chart on the official TF website, the correspondence between tf and cuda is:
Version                  cuDNN   CUDA
tensorflow-2.1.0         7.6     10.1
tensorflow-2.0.0         7.4     10.0
tensorflow_gpu-1.14.0    7.4     10.0
tensorflow_gpu-1.13.1    7.4     10.0
Thus, you can either upgrade your tf to 2.1 or downgrade your cuda with:
conda install cudatoolkit=10.0.130
This will automatically downgrade your cudnn as well.
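To make the check explicit, the table above can be turned into a small lookup. This is just an illustrative sketch: the dictionary holds only the four rows listed in this answer, and the names TESTED and cuda_matches are made up here.

```python
# (cuDNN, CUDA) pairs copied from the table in this answer, keyed by TF version.
TESTED = {
    "2.1.0": ("7.6", "10.1"),
    "2.0.0": ("7.4", "10.0"),
    "1.14.0": ("7.4", "10.0"),
    "1.13.1": ("7.4", "10.0"),
}


def cuda_matches(tf_version, cuda_version):
    """Return True if cuda_version is the one listed for tf_version.

    Versions missing from TESTED are reported as False.
    """
    entry = TESTED.get(tf_version)
    return entry is not None and entry[1] == cuda_version


if __name__ == "__main__":
    # The combination from the question: tensorflow-gpu 1.13.1 with CUDA 10.1.
    print(cuda_matches("1.13.1", "10.1"))
```

For the setup in the question this reports a mismatch, which is why either upgrading tf or downgrading cuda resolves the import error.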