I am trying to build TensorFlow, but it is asking for libcusolver.so.11 and I only have libcusolver.so.10. Can someone tell me what I am doing wrong?
Here are my Ubuntu, NVIDIA and CUDA versions:
$ uname -a
Linux *****-dev-01 5.4.0-42-generic #46-Ubuntu SMP Fri Jul 10 00:24:02 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
$ nvidia-smi --query-gpu=gpu_name --format=csv | tail -n 1
GeForce GTX 1650
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Thu_Jun_11_22:26:38_PDT_2020
Cuda compilation tools, release 11.0, V11.0.194
Build cuda_11.0_bu.TC445_37.28540450_0
Here is how I am building TensorFlow:
$ git clone https://github.com/tensorflow/tensorflow.git
$ cd ./tensorflow
$ git checkout tags/v2.2.0
$ ./configure
$ bazel build --config=v2 --config=cuda --config=monolithic --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-msse4.1 --copt=-msse4.2 --copt=-Wno-sign-compare //tensorflow:libtensorflow_cc.so
Here is the error I am receiving:
ERROR: An error occurred during the fetch of repository 'local_config_cuda':
Traceback (most recent call last):
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 1210
_create_local_cuda_repository(<1 more arguments>)
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 934, in _create_local_cuda_repository
_find_libs(repository_ctx, <2 more arguments>)
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 577, in _find_libs
_check_cuda_libs(repository_ctx, <2 more arguments>)
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 479, in _check_cuda_libs
execute(repository_ctx, <1 more arguments>)
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/remote_config/common.bzl", line 208, in execute
fail(<1 more arguments>)
Repository command failed
No library found under: /usr/local/cuda/lib64/libcusolver.so.11
ERROR: Skipping '//tensorflow:libtensorflow_cc.so': no such package '@local_config_cuda//cuda': Traceback (most recent call last):
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 1210
_create_local_cuda_repository(<1 more arguments>)
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 934, in _create_local_cuda_repository
_find_libs(repository_ctx, <2 more arguments>)
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 577, in _find_libs
_check_cuda_libs(repository_ctx, <2 more arguments>)
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 479, in _check_cuda_libs
execute(repository_ctx, <1 more arguments>)
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/remote_config/common.bzl", line 208, in execute
fail(<1 more arguments>)
Repository command failed
No library found under: /usr/local/cuda/lib64/libcusolver.so.11
WARNING: Target pattern parsing failed.
ERROR: no such package '@local_config_cuda//cuda': Traceback (most recent call last):
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 1210
_create_local_cuda_repository(<1 more arguments>)
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 934, in _create_local_cuda_repository
_find_libs(repository_ctx, <2 more arguments>)
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 577, in _find_libs
_check_cuda_libs(repository_ctx, <2 more arguments>)
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 479, in _check_cuda_libs
execute(repository_ctx, <1 more arguments>)
File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/remote_config/common.bzl", line 208, in execute
fail(<1 more arguments>)
Repository command failed
No library found under: /usr/local/cuda/lib64/libcusolver.so.11
INFO: Elapsed time: 1.998s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
currently loading: tensorflow
If you want a concrete workaround, find libcusolver.so.10 on your machine and create a symbolic link to it named libcusolver.so.11.
The following command solved the issue for me:
sudo ln -s /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcusolver.so.10 /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcusolver.so.11
Credit to: https://github.com/tensorflow/tensorflow/issues/43947
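A slightly more defensive variant of the command above, as a minimal sketch: it creates the link only when libcusolver.so.11 is missing and libcusolver.so.10 is present. The function name link_cusolver is hypothetical, and the directory is the one from this answer, so yours may differ. Since the link makes a version 10 library answer to a version 11 name, treat it as a workaround rather than a fix.

```shell
# Hypothetical helper: link libcusolver.so.11 to libcusolver.so.10 inside one directory.
link_cusolver() {
  local dir="$1"
  if [ -e "$dir/libcusolver.so.11" ] || [ -L "$dir/libcusolver.so.11" ]; then
    # Never overwrite an existing file or link.
    echo "libcusolver.so.11 already present in $dir"
  elif [ -e "$dir/libcusolver.so.10" ]; then
    ln -s "$dir/libcusolver.so.10" "$dir/libcusolver.so.11"
    echo "linked libcusolver.so.11 -> libcusolver.so.10 in $dir"
  else
    echo "no libcusolver.so.10 found in $dir" >&2
    return 1
  fi
}
```

Run it from a shell that can write to the CUDA directory (usually a root shell), for example: link_cusolver /usr/local/cuda-11.0/targets/x86_64-linux/lib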
"Can someone tell me what I am doing wrong?"
Nothing.
As noted in the comments, the CUDA 11.0 release does not include a version 11 of cuSOLVER; it still ships libcusolver.so.10. There is plainly some logic in the Bazel build configuration (the traceback points at third_party/gpus/cuda_configure.bzl in the TensorFlow tree) that automagically derives the names of the component libraries from the major version of the toolkit it detects. That logic is not correct for the CUDA toolkit you have. I would raise this as a bug with the maintainers of that build configuration. You might be able to override the library version explicitly in some way, but I can't tell you how.
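To see the mismatch for yourself, you can check which of the expected file names actually exist in the CUDA library directory. This is only a sketch: missing_libs is a hypothetical helper, and the file names passed to it are the two mentioned in the question.

```shell
# Hypothetical helper: print each requested file name that is absent from a directory.
missing_libs() {
  local dir="$1"
  shift
  local name
  for name in "$@"; do
    [ -e "$dir/$name" ] || echo "$name"
  done
}
```

For example, missing_libs /usr/local/cuda/lib64 libcusolver.so.10 libcusolver.so.11 prints whichever of the two names is not there; on a setup like the one in the question, libcusolver.so.11 would be expected in that output.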
If anyone else comes across this issue: the problem for me was that I was using CUDA 11.0, while more recent TensorFlow versions require CUDA 11.2.
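A quick way to compare the installed toolkit against a minimum version is to parse the release number out of nvcc --version. The sketch below assumes GNU sort (for the -V option) and uses 11.2 only because that is the number given in this answer; check what your TensorFlow version actually needs. The function names are hypothetical.

```shell
# Read `nvcc --version` style text on stdin and print the release number.
cuda_release() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Succeed when version $1 is greater than or equal to version $2 (relies on GNU sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="11.2"   # assumption: taken from the answer above, not from TensorFlow's docs
if command -v nvcc >/dev/null 2>&1; then
  installed="$(nvcc --version | cuda_release)"
  if version_ge "$installed" "$required"; then
    echo "CUDA $installed satisfies the assumed minimum $required"
  else
    echo "CUDA $installed is older than the assumed minimum $required"
  fi
fi
```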