Tensorflow will not run on GPU

I'm new to AWS and TensorFlow, and I've been learning about CNNs over the last week through Udacity's Machine Learning course. I now need to train on a GPU, so I launched a p2.xlarge instance using the Deep Learning AMI with Source Code (CUDA 8, Ubuntu), which is what they recommended.

However, TensorFlow does not seem to be using the GPU at all; training still runs on the CPU. I searched and found several answers to this problem, but none of them worked.

nvidia-smi gives the following output:

When I run the Jupyter notebook, it still uses the CPU.

What do I do to get it to run on the GPU and not the CPU?

asked Dec 24 '18 12:12

Pawan Bhandarkar

1 Answer

When TensorFlow does not detect the GPU, the cause is most likely one of the following:

Only the CPU version of TensorFlow is installed on the system.
Both the CPU and GPU versions of TensorFlow are installed, but the Python environment prefers the CPU version over the GPU version.

Before solving the issue, we assume that the environment is an AWS Deep Learning AMI with CUDA 8.0 and TensorFlow 1.4.1 installed. This assumption comes from the discussion in the comments.

To solve the problem, we proceed as follows:

Check the installed version of tensorflow by executing the following command from the OS terminal.

pip freeze | grep tensorflow

If only the CPU version is installed, then remove it and install the GPU version by executing the following commands.

pip uninstall tensorflow

pip install tensorflow-gpu==1.4.1

If both CPU and GPU versions are installed, then remove both of them, and install the GPU version only.

pip uninstall tensorflow

pip uninstall tensorflow-gpu

pip install tensorflow-gpu==1.4.1
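
The decision logic in the steps above can be sketched in Python. This is a minimal sketch and not part of the original answer: the function names installed_tensorflow_packages and plan_pip_commands are hypothetical, and the pinned version 1.4.1 comes from the assumption stated earlier. It only reads text in the format printed by pip freeze and returns the commands to run; it does not execute anything.

```python
def installed_tensorflow_packages(freeze_output):
    """Return the TensorFlow runtime packages named in `pip freeze` text."""
    found = set()
    for line in freeze_output.splitlines():
        # Each line looks like "name==version"; keep only the name part.
        name = line.split("==")[0].strip().lower().replace("_", "-")
        # Exact match, so packages such as tensorflow-tensorboard are ignored.
        if name in ("tensorflow", "tensorflow-gpu"):
            found.add(name)
    return found


def plan_pip_commands(freeze_output, version="1.4.1"):
    """Return the pip commands from the steps above for the given state."""
    found = installed_tensorflow_packages(freeze_output)
    if found == {"tensorflow-gpu"}:
        return []  # only the GPU build is present; nothing to change
    commands = []
    if "tensorflow" in found:
        commands.append("pip uninstall tensorflow")
    if "tensorflow-gpu" in found:
        commands.append("pip uninstall tensorflow-gpu")
    commands.append("pip install tensorflow-gpu==" + version)
    return commands
```

Passing the text printed by pip freeze to plan_pip_commands returns an empty list when only the GPU build is installed, and otherwise the uninstall and install commands listed above.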

At this point, if all of TensorFlow's dependencies are installed correctly, the GPU version should work. A common error at this stage (as encountered by the OP) is a missing cuDNN library, which produces the following error when importing TensorFlow in a Python module:

ImportError: libcudnn.so.6: cannot open shared object file: No such file or directory

This can be fixed by installing the correct version of NVIDIA's cuDNN library. TensorFlow 1.4.1 depends on cuDNN 6.0 and CUDA 8, so we download the corresponding version from the cuDNN archive page. The download requires logging in to an NVIDIA developer account, so the file cannot be fetched directly with command-line tools such as wget or curl. One workaround is to download the file on the host system and use scp to copy it to the AWS instance.
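
Before downloading anything, it can help to confirm that the dynamic loader really cannot find the library named in the ImportError. The following is a minimal sketch using Python's standard ctypes module; the helper name can_load_shared_library is hypothetical. It asks the loader to open a shared library by name and reports whether that succeeded.

```python
import ctypes


def can_load_shared_library(name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Raised when the file is missing or cannot be loaded.
        return False
    return True


if __name__ == "__main__":
    # libcudnn.so.6 is the file named in the ImportError shown earlier.
    print("libcudnn.so.6 loadable:", can_load_shared_library("libcudnn.so.6"))
```

The same check can be repeated after installing cuDNN to see whether the library has become visible to the loader.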

Once copied to AWS, extract the file using the following command:

tar -xzvf cudnn-8.0-linux-x64-v6.0.tgz

The extracted directory should have a structure similar to the CUDA Toolkit installation directory. Assuming the CUDA Toolkit is installed in /usr/local/cuda, we can install cuDNN by copying the files from the extracted archive into the corresponding folders of the CUDA Toolkit installation directory and then running the linker update command ldconfig:

cp cuda/include/* /usr/local/cuda/include

cp cuda/lib64/* /usr/local/cuda/lib64

ldconfig
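
For anyone who prefers to script the copy step, here is a rough Python equivalent of the two cp commands. It is a sketch under assumptions: the function name copy_cudnn_files is hypothetical, both target folders must already exist, and the process needs write permission on the CUDA directory. It does not run ldconfig, which still has to be executed afterwards.

```python
import glob
import os
import shutil


def copy_cudnn_files(extracted_dir, cuda_home):
    """Copy include/* and lib64/* from the extracted archive into cuda_home."""
    copied = []
    for sub in ("include", "lib64"):
        dest_dir = os.path.join(cuda_home, sub)
        if not os.path.isdir(dest_dir):
            raise FileNotFoundError("missing target folder: " + dest_dir)
        for src in sorted(glob.glob(os.path.join(extracted_dir, sub, "*"))):
            if os.path.isfile(src):
                # Like plain cp, this follows symlinks and copies file contents.
                copied.append(shutil.copy(src, dest_dir))
    return copied
```

With the paths used above, the call would be copy_cudnn_files("cuda", "/usr/local/cuda"); it returns the list of destination paths that were written.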

After this, we should be able to import the TensorFlow GPU version into our Python modules.
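
To confirm that TensorFlow actually sees the GPU once the import works, one option is to list its local devices. The sketch below is an illustration rather than part of the original answer: it relies on tensorflow.python.client.device_lib, which is widely used in TensorFlow 1.x but is not part of the documented public API, and the helper names gpu_device_names and tensorflow_gpu_names are hypothetical.

```python
def gpu_device_names(devices):
    """Return the names of entries whose device_type is "GPU"."""
    return [d.name for d in devices if d.device_type == "GPU"]


def tensorflow_gpu_names():
    """Ask TensorFlow for its local devices and return the GPU names."""
    # Imported inside the function so that an ImportError (for example a
    # missing libcudnn) surfaces when the check is called, not at load time.
    from tensorflow.python.client import device_lib
    return gpu_device_names(device_lib.list_local_devices())
```

Calling tensorflow_gpu_names() in the Jupyter notebook and getting an empty list back would suggest that TensorFlow still sees only the CPU.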

A few considerations:

If we are using Python 3, pip should be replaced with pip3.
Depending on user privileges, the pip, cp and ldconfig commands may need to be run with sudo.

answered Jan 03 '23 23:01

T.Z

Tags: tensorflow, gpu