Cannot run tensorflow on GPU

I want to run TensorFlow code on my GPU, but it's not working. I have CUDA and cuDNN installed, and my GPU is compatible.

I took this sample from the GPU tutorial on the official TensorFlow website:

import tensorflow as tf

# Creates a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))

Here is the output:

Device mapping: no known devices.
2017-10-31 16:15:40.298845: I tensorflow/core/common_runtime/direct_session.cc:300] Device mapping:

MatMul: (MatMul): /job:localhost/replica:0/task:0/cpu:0
2017-10-31 16:15:56.895802: I tensorflow/core/common_runtime/simple_placer.cc:872] MatMul: (MatMul)/job:localhost/replica:0/task:0/cpu:0
b: (Const): /job:localhost/replica:0/task:0/cpu:0
2017-10-31 16:15:56.895910: I tensorflow/core/common_runtime/simple_placer.cc:872] b: (Const)/job:localhost/replica:0/task:0/cpu:0
a_1: (Const): /job:localhost/replica:0/task:0/cpu:0
2017-10-31 16:15:56.895961: I tensorflow/core/common_runtime/simple_placer.cc:872] a_1: (Const)/job:localhost/replica:0/task:0/cpu:0
a: (Const): /job:localhost/replica:0/task:0/cpu:0
2017-10-31 16:15:56.896006: I tensorflow/core/common_runtime/simple_placer.cc:872] a: (Const)/job:localhost/replica:0/task:0/cpu:0
[[ 22.  28.]
 [ 49.  64.]]

The GPU does not show up as an option. I tried to force the code onto the GPU manually like this:

with tf.device('/gpu:0'):
...

It gave a bunch of errors:

Traceback (most recent call last):
  File "/home/abhor/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1327, in _do_call
    return fn(*args)
  File "/home/abhor/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1297, in _run_fn
    self._extend_graph()
  File "/home/abhor/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1358, in _extend_graph
    self._session, graph_def.SerializeToString(), status)
  File "/home/abhor/anaconda3/lib/python3.6/contextlib.py", line 88, in __exit__
    next(self.gen)
  File "/home/abhor/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation 'MatMul_1': Operation was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/cpu:0 ]. Make sure the device specification refers to a valid device.
     [[Node: MatMul_1 = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/device:GPU:0"](a_2, b_1)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/home/abhor/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/home/abhor/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1124, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/abhor/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
    options, run_metadata)
  File "/home/abhor/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation 'MatMul_1': Operation was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/cpu:0 ]. Make sure the device specification refers to a valid device.
     [[Node: MatMul_1 = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/device:GPU:0"](a_2, b_1)]]

Caused by op 'MatMul_1', defined at:
  File "<stdin>", line 4, in <module>
  File "/home/abhor/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 1844, in matmul
    a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
  File "/home/abhor/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py", line 1289, in _mat_mul
    transpose_b=transpose_b, name=name)
  File "/home/abhor/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/home/abhor/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/abhor/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Cannot assign a device for operation 'MatMul_1': Operation was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/cpu:0 ]. Make sure the device specification refers to a valid device.
     [[Node: MatMul_1 = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/device:GPU:0"](a_2, b_1)]]

I see that in some lines it says only CPU is available.

Here are my graphics card details and CUDA version.

Output for nvidia-smi:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.81                 Driver Version: 384.81                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce 940MX       Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   43C    P0    N/A /  N/A |    274MiB /  2002MiB |     10%      Default |
+-------------------------------+----------------------+----------------------+

Output for nvcc -V:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176

I don't know how to check for cuDNN, but I installed it the way it was given in the official documentation, so I am guessing it should be working as well.

EDIT: Output for pip3 list | grep tensorflow

tensorflow-gpu (1.3.0)
tensorflow-tensorboard (0.1.8)

asked Oct 31 '17 17:10

2 Answers

Try this piece of code:

sess = tf.Session(config=tf.ConfigProto(
      allow_soft_placement=True, log_device_placement=True))
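
A note on what this setting does, as a hedged sketch rather than a fix: `allow_soft_placement=True` lets TensorFlow fall back to a device that exists when the requested one (here `/gpu:0`) is not available. That should make the `InvalidArgumentError` from the question go away, but the op may still end up on the CPU. The example below assumes the TensorFlow 1.x graph API used in the question; the `tf1` alias and the function name `run_matmul_soft` are illustrative only, and on newer installs the sketch goes through `tf.compat.v1`.

```python
try:
    import tensorflow as tf
except ImportError:  # TensorFlow is not installed; the sketch cannot run then
    tf = None


def run_matmul_soft():
    """Run the question's matmul pinned to /gpu:0 with soft placement enabled."""
    # Assumption: prefer tf.compat.v1 when it exists (newer releases),
    # otherwise fall back to the plain 1.x namespace.
    tf1 = getattr(getattr(tf, "compat", None), "v1", tf)
    with tf1.Graph().as_default():
        # Explicitly request the first GPU, as in the question.
        with tf1.device('/gpu:0'):
            a = tf1.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
            b = tf1.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
            c = tf1.matmul(a, b)
        config = tf1.ConfigProto(
            allow_soft_placement=True,   # fall back instead of raising
            log_device_placement=True)   # log where each op actually runs
        with tf1.Session(config=config) as sess:
            return sess.run(c)
```

If the placement log still shows every op on `cpu:0`, TensorFlow does not see the GPU and the underlying installation problem remains.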

answered Oct 22 '22 10:10

Luís Carlos Silva Eiras

TensorFlow cannot find the CUDA GPU on your system.

Look at the device list in the error output:

Operation was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/cpu:0 ]

This means no GPU was found. You can use the code below, taken from the question "How to get current available GPUs in tensorflow?", to list the GPUs that TensorFlow can actually find.

from tensorflow.python.client import device_lib

def get_available_gpus():
    local_device_protos = device_lib.list_local_devices()
    return [x.name for x in local_device_protos if x.device_type == 'GPU']

Make sure this function actually returns your GPU(s); only then can TensorFlow use the GPU device.
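
As a rough sketch of how that check might be used, the variant below turns the device list into a readable message. The `list_devices` parameter and the `describe_gpus` helper are illustrative additions, not part of TensorFlow; the parameter only exists so the filtering logic can be tried without a GPU, and by default the function calls `device_lib.list_local_devices()` as above.

```python
def get_available_gpus(list_devices=None):
    """Return the names of the GPU devices that TensorFlow can see."""
    if list_devices is None:
        # Imported lazily so the helper below stays usable without TensorFlow.
        from tensorflow.python.client import device_lib
        list_devices = device_lib.list_local_devices
    return [d.name for d in list_devices() if d.device_type == 'GPU']


def describe_gpus(gpu_names):
    """Turn a list of GPU device names into a one-line summary."""
    if not gpu_names:
        return "No GPU visible to TensorFlow; ops will be placed on the CPU."
    return "TensorFlow can use: " + ", ".join(gpu_names)
```

Calling `print(describe_gpus(get_available_gpus()))` in the same Python environment that runs your script shows whether the GPU is visible there. An empty result would match the "available devices" list in the error message.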

There are many possible reasons why the GPU cannot be found, including but not limited to the CUDA installation or settings, the TensorFlow version, and the GPU model, especially its compute capability. Check which GPU models your TensorFlow version supports, and check the compute capability of your GPU (for NVIDIA GPUs).
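
For the CUDA installation part, one possible sanity check is to ask the dynamic loader whether it can locate the CUDA runtime and cuDNN libraries at all. The sketch below uses only the Python standard library; the function names are hypothetical, and it assumes the libraries go by the usual names `cudart` and `cudnn`. It does not tell you whether the versions found match what your TensorFlow build expects, which is worth comparing separately (the question shows CUDA 9.0 alongside tensorflow-gpu 1.3.0).

```python
import ctypes
import ctypes.util


def cuda_library_report():
    """Map each library name to what the loader finds for it (None if nothing)."""
    return {name: ctypes.util.find_library(name) for name in ("cudart", "cudnn")}


def cudnn_version():
    """Return the integer reported by cudnnGetVersion(), or None if unavailable."""
    found = ctypes.util.find_library("cudnn")
    if found is None:
        return None
    try:
        lib = ctypes.CDLL(found)
        # cudnnGetVersion takes no arguments and returns a size_t.
        lib.cudnnGetVersion.restype = ctypes.c_size_t
        return int(lib.cudnnGetVersion())
    except (OSError, AttributeError):
        # The library could not be loaded or does not export the symbol.
        return None
```

A `None` entry from `cuda_library_report()` suggests that the library is not on the loader's search path, for example because `LD_LIBRARY_PATH` does not include the CUDA or cuDNN directories.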

answered Oct 22 '22 09:10

Kelly Hwong

Tags: python, tensorflow, gpu

Asked by: Abhor